Authors: Martin Morgan (mtmorgan@fredhutch.org), Sonali Arora (sarora@fredhutch.org)
Date: 30 June, 2015

Analysis & Comprehension of High Throughput Sequence Data

Overall workflow

  1. Experimental design
    • Keep it simple!
    • Replication!
    • Avoid or track batch effects
  2. Wet-lab preparation
  3. High-throughput sequencing
    • Output: FASTQ files of reads and their quality scores
  4. Alignment
    • Many different aligners, some specialized for different purposes
    • Output: BAM files of aligned reads
  5. Summary
    • e.g., count of reads overlapping regions of interest (e.g., genes)
  6. Statistical analysis
  7. Comprehension

[Figure: Sequencing Ecosystem]

Notes

Two simple shiny apps

How Bioconductor helps

Annotation

Standard (large) file input & manipulation, e.g., BAM files of aligned reads

Statistical analysis of differential expression

Annotation

Gene models – TxDb, GRanges, and GRangesList

Gene model annotation resources – TxDb packages

TxDb.Hsapiens.UCSC.hg19.knownGene

library("TxDb.Hsapiens.UCSC.hg19.knownGene")
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
txdb
## TxDb object:
## # Db type: TxDb
## # Supporting package: GenomicFeatures
## # Data source: UCSC
## # Genome: hg19
## # Organism: Homo sapiens
## # UCSC Table: knownGene
## # Resource URL: http://genome.ucsc.edu/
## # Type of Gene ID: Entrez Gene ID
## # Full dataset: yes
## # miRBase build ID: GRCh37
## # transcript_nrow: 82960
## # exon_nrow: 289969
## # cds_nrow: 237533
## # Db created by: GenomicFeatures package from Bioconductor
## # Creation time: 2015-03-19 13:55:51 -0700 (Thu, 19 Mar 2015)
## # GenomicFeatures version at creation time: 1.19.32
## # RSQLite version at creation time: 1.0.0
## # DBSCHEMAVERSION: 1.1
methods(class=class(txdb))
##  [1] $                      $<-                    annotatedDataFrameFrom as.list               
##  [5] asBED                  asGFF                  assayData              assayData<-           
##  [9] cds                    cdsBy                  cdsByOverlaps          coerce                
## [13] columns                combine                contents               dbconn                
## [17] dbfile                 dbInfo                 dbmeta                 dbschema              
## [21] disjointExons          distance               exons                  exonsBy               
## [25] exonsByOverlaps        ExpressionSet          extractUpstreamSeqs    featureNames          
## [29] featureNames<-         fiveUTRsByTranscript   genes                  initialize            
## [33] intronsByTranscript    isActiveSeq            isActiveSeq<-          isNA                  
## [37] keys                   keytypes               mapIds                 mappedkeys            
## [41] mapToTranscripts       metadata               microRNAs              nhit                  
## [45] organism               promoters              revmap                 sample                
## [49] sampleNames            sampleNames<-          saveDb                 select                
## [53] seqinfo                seqinfo<-              seqlevels0             show                  
## [57] species                storageMode            storageMode<-          threeUTRsByTranscript 
## [61] transcripts            transcriptsBy          transcriptsByOverlaps  tRNAs                 
## [65] updateObject          
## see '?methods' for accessing help and source code

TxDb objects

Accessing gene models

  • exons(), transcripts(), genes(), cds() (coding sequence)
  • promoters() & friends
  • exonsBy() & friends – exons by gene, transcript, …
  • ‘select’ interface: keytypes(), columns(), keys(), select(), mapIds()
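
The accessors listed above can be tried directly on the TxDb object. This is a minimal sketch rather than part of the original material: the promoter window (2000 bp upstream, 200 bp downstream) and the example key "672" (the Entrez Gene ID of BRCA1) are illustrative choices.

```r
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

## Gene-level ranges: a GRanges with one range per gene
gn <- genes(txdb)

## Exons grouped by gene: a GRangesList with one element per gene
exByGene <- exonsBy(txdb, by = "gene")

## Promoter windows around transcript starts (window sizes are illustrative)
prom <- promoters(txdb, upstream = 2000, downstream = 200)

## 'select' interface: transcripts annotated to Entrez Gene ID 672 (BRCA1)
tx <- AnnotationDbi::select(txdb, keys = "672", keytype = "GENEID",
                            columns = c("TXNAME", "TXCHROM"))
head(tx)
```

keytypes(txdb) and columns(txdb) list the values that are valid for the keytype= and columns= arguments.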

Genomic ranges – GRanges

exons(): GRanges

exons(txdb)
## GRanges object with 289969 ranges and 1 metadata column:
##            seqnames               ranges strand   |   exon_id
##               <Rle>            <IRanges>  <Rle>   | <integer>
##        [1]     chr1       [11874, 12227]      +   |         1
##        [2]     chr1       [12595, 12721]      +   |         2
##        [3]     chr1       [12613, 12721]      +   |         3
##        [4]     chr1       [12646, 12697]      +   |         4
##        [5]     chr1       [13221, 14409]      +   |         5
##        ...      ...                  ...    ... ...       ...
##   [289965]     chrY [27607404, 27607432]      -   |    277746
##   [289966]     chrY [27635919, 27635954]      -   |    277747
##   [289967]     chrY [59358329, 59359508]      -   |    277748
##   [289968]     chrY [59360007, 59360115]      -   |    277749
##   [289969]     chrY [59360501, 59360854]      -   |    277750
##   -------
##   seqinfo: 93 sequences (1 circular) from hg19 genome

[Figure: Genomic Ranges]

methods(class="GRanges"): hundreds of methods!

GRanges Algebra

  • Intra-range methods
    • Independent of other ranges in the same object
    • GRanges variants strand-aware
    • shift(), narrow(), flank(), promoters(), resize(), restrict(), trim()
    • See ?"intra-range-methods"
  • Inter-range methods
    • Depends on other ranges in the same object
    • range(), reduce(), gaps(), disjoin()
    • coverage() (!)
    • See ?"inter-range-methods"
  • Between-range methods
    • Functions of two (or more) range objects
    • findOverlaps(), countOverlaps(), …, %over%, %within%, %outside%; union(), intersect(), setdiff(), punion(), pintersect(), psetdiff()
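
The three families of operations can be illustrated on a toy object. This is a small sketch with made-up coordinates (three ranges on chr1 and one query range), not data from the workshop.

```r
library(GenomicRanges)

## Three toy ranges: two overlapping on the + strand, one on the - strand
gr <- GRanges("chr1",
              IRanges(start = c(10, 15, 40), end = c(20, 25, 50)),
              strand = c("+", "+", "-"))

## Intra-range: each range is transformed independently of the others
shifted <- shift(gr, 5)          # move every range 5 bases to the right
upstream <- flank(gr, 3)         # strand-aware: 3 bases upstream of each range

## Inter-range: the result depends on the other ranges in the same object
merged <- reduce(gr)             # overlapping ranges on the same strand are merged

## Between-range: compare two range objects
query <- GRanges("chr1", IRanges(18, 42), strand = "*")
n <- countOverlaps(query, gr)    # number of ranges in 'gr' hit by 'query'
hits <- gr[gr %over% query]      # subset of 'gr' overlapping 'query'
```

Because flank() is strand-aware, the flank of the minus-strand range lies to the right of its end coordinate rather than to the left of its start.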

Lists of genomic ranges – GRangesList

exonsBy(): GRangesList

exonsBy(txdb, "tx")
## GRangesList object of length 82960:
## $1 
## GRanges object with 3 ranges and 3 metadata columns:
##       seqnames         ranges strand |   exon_id   exon_name exon_rank
##          <Rle>      <IRanges>  <Rle> | <integer> <character> <integer>
##   [1]     chr1 [11874, 12227]      + |         1        <NA>         1
##   [2]     chr1 [12613, 12721]      + |         3        <NA>         2
##   [3]     chr1 [13221, 14409]      + |         5        <NA>         3
## 
## $2 
## GRanges object with 3 ranges and 3 metadata columns:
##       seqnames         ranges strand | exon_id exon_name exon_rank
##   [1]     chr1 [11874, 12227]      + |       1      <NA>         1
##   [2]     chr1 [12595, 12721]      + |       2      <NA>         2
##   [3]     chr1 [13403, 14409]      + |       6      <NA>         3
## 
## $3 
## GRanges object with 3 ranges and 3 metadata columns:
##       seqnames         ranges strand | exon_id exon_name exon_rank
##   [1]     chr1 [11874, 12227]      + |       1      <NA>         1
##   [2]     chr1 [12646, 12697]      + |       4      <NA>         2
##   [3]     chr1 [13221, 14409]      + |       5      <NA>         3
## 
## ...
## <82957 more elements>
## -------
## seqinfo: 93 sequences (1 circular) from hg19 genome

[Figure: Genomic Ranges]

Algebra of genomic ranges

GRanges / GRangesList are incredibly useful

  • Represent annotations – genes, variants, regulatory elements, copy number regions, …
  • Represent data – aligned reads, ChIP peaks, called variants, …

Many biologically interesting questions represent operations on ranges

  • Count overlaps between aligned reads and known genes – GenomicAlignments::summarizeOverlaps()
  • Genes nearest to regulatory regions – GenomicRanges::nearest(), ChIPseeker
  • Called variants relevant to clinical phenotypes – VariantFiltering

[Figure: Ranges Algebra]

Identifier mapping – OrgDb

library(org.Hs.eg.db)
org.Hs.eg.db
## OrgDb object:
## | DBSCHEMAVERSION: 2.1
## | Db type: OrgDb
## | Supporting package: AnnotationDbi
## | DBSCHEMA: HUMAN_DB
## | ORGANISM: Homo sapiens
## | SPECIES: Human
## | EGSOURCEDATE: 2015-Mar17
## | EGSOURCENAME: Entrez Gene
## | EGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
## | CENTRALID: EG
## | TAXID: 9606
## | GOSOURCENAME: Gene Ontology
## | GOSOURCEURL: ftp://ftp.geneontology.org/pub/go/godatabase/archive/latest-lite/
## | GOSOURCEDATE: 20150314
## | GOEGSOURCEDATE: 2015-Mar17
## | GOEGSOURCENAME: Entrez Gene
## | GOEGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
## | KEGGSOURCENAME: KEGG GENOME
## | KEGGSOURCEURL: ftp://ftp.genome.jp/pub/kegg/genomes
## | KEGGSOURCEDATE: 2011-Mar15
## | GPSOURCENAME: UCSC Genome Bioinformatics (Homo sapiens)
## | GPSOURCEURL: ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19
## | GPSOURCEDATE: 2010-Mar22
## | ENSOURCEDATE: 2015-Mar13
## | ENSOURCENAME: Ensembl
## | ENSOURCEURL: ftp://ftp.ensembl.org/pub/current_fasta
## | UPSOURCENAME: Uniprot
## | UPSOURCEURL: http://www.UniProt.org/
## | UPSOURCEDATE: Tue Mar 17 18:48:15 2015
## 
## Please see: help('select') for usage information

OrgDb objects

select()
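
As an illustration of the select interface on an OrgDb object, the sketch below maps two gene symbols to other identifiers. The symbols BRCA1 and TP53 are example keys chosen here, not taken from the original material.

```r
library(org.Hs.eg.db)

## Discover what can be used as keys and what can be retrieved
head(keytypes(org.Hs.eg.db))
head(columns(org.Hs.eg.db))

## select(): a data.frame with one row per key-to-value mapping
symbols <- c("BRCA1", "TP53")
tbl <- AnnotationDbi::select(org.Hs.eg.db, keys = symbols,
                             keytype = "SYMBOL",
                             columns = c("ENTREZID", "GENENAME"))

## mapIds(): a named vector with exactly one value per key
ids <- mapIds(org.Hs.eg.db, keys = symbols,
              keytype = "SYMBOL", column = "ENTREZID")
```

select() may return several rows for one key when the mapping is one-to-many; mapIds() keeps the first match by default (see its multiVals argument).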

Related functionality

Other annotation resources – biomaRt, AnnotationHub

biomaRt & friends

http://biomart.org; Bioconductor package biomaRt

## NEEDS INTERNET ACCESS !!
library(biomaRt)
head(listMarts(), 3)                      ## list marts
head(listDatasets(useMart("ensembl")), 3) ## mart datasets
ensembl <-                                ## fully specified mart
    useMart("ensembl", dataset = "hsapiens_gene_ensembl")

head(listFilters(ensembl), 3)             ## filters
myFilter <- "chromosome_name"
substr(filterOptions(myFilter, ensembl), 1, 50) ## return values
myValues <- c("21", "22")
head(listAttributes(ensembl), 3)          ## attributes
myAttributes <- c("ensembl_gene_id","chromosome_name")

## assemble and query the mart
res <- getBM(attributes =  myAttributes, filters =  myFilter,
             values =  myValues, mart = ensembl)

Other internet resources

AnnotationHub

  • Bioconductor package AnnotationHub
  • Meant to ease use of ‘consortium’ and other genome-scale resources
  • Simplify discovery, retrieval, local management, and import to standard Bioconductor representations

Example: Ensembl ‘GTF’ files to R / Bioconductor GRanges and TxDb

library(AnnotationHub)
hub <- AnnotationHub()
hub
query(hub, c("Ensembl", "80", "gtf"))
## ensgtf = display(hub)                   # visual choice
hub["AH47107"]
gtf <- hub[["AH47107"]]
gtf
txdb <- GenomicFeatures::makeTxDbFromGRanges(gtf)

Example: non-model organism OrgDb packages

library(AnnotationHub)
hub <- AnnotationHub()
query(hub, "OrgDb")

Example: Map Roadmap epigenomic marks to hg38

  • Roadmap BED file as GRanges

    library(AnnotationHub)
    hub <- AnnotationHub()
    query(hub , c("EpigenomeRoadMap", "E126", "H3K4ME2"))
    E126 <- hub[["AH29817"]]
  • UCSC ‘liftOver’ file to map coordinates

    query(hub , c("hg19", "hg38", "chainfile"))
    chain <- hub[["AH14150"]]
  • lift over – possibly one-to-many mapping, so GRanges to GRangesList

    library(rtracklayer)
    E126hg38 <- liftOver(E126, chain)
    E126hg38

Input & representation of standard file formats

BAM files of aligned reads – GenomicAlignments

Recall: overall workflow

  1. Experimental design
  2. Wet-lab preparation
  3. High-throughput sequencing
  4. Alignment
    • Whole genome, vs. transcriptome
  5. Summary
  6. Statistical analysis
  7. Comprehension

BAM files of aligned reads

GenomicAlignments

Other formats and packages

[Figure: Files and the Bioconductor packages that input them]

Large data – BiocParallel, GenomicFiles

Restriction

  • Input only the data necessary, e.g., ScanBamParam()
  • which: genomic ranges of interest
  • what: ‘columns’ of BAM file, e.g., ‘seq’, ‘flag’
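
A sketch of restriction, using one of the example BAM files from the RNAseqData.HNRNPC.bam.chr14 package. The region of interest is an assumption made for illustration: approximate hg19 coordinates around the HNRNPC locus on chr14.

```r
library(GenomicAlignments)
library(RNAseqData.HNRNPC.bam.chr14)

fl <- RNAseqData.HNRNPC.bam.chr14_BAMFILES[1]

## which: only alignments overlapping this region (coordinates are illustrative)
roi <- GRanges("chr14", IRanges(21677000, 21738000))

## what: only these optional BAM fields, in addition to the alignment itself
param <- ScanBamParam(which = roi, what = c("seq", "flag"))

## Requires a BAM index (.bai) next to the file when 'which' is used
aln <- readGAlignments(fl, param = param)
aln
```

The requested fields appear as metadata columns, e.g., mcols(aln)$seq.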

Iteration

  • Read entire file, but in chunks
  • Chunk size small enough to fit easily in memory
  • Chunk size large enough to benefit from R’s vectorized operations – 10k to 1M records at a time
  • e.g., BamFile(..., yieldSize=100000)
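
Written out by hand, iteration through a file might look like the sketch below: open the BamFile, read successive chunks of yieldSize records, and stop when a chunk comes back empty. Counting records is a placeholder for whatever per-chunk work is needed.

```r
library(GenomicAlignments)
library(RNAseqData.HNRNPC.bam.chr14)

fl <- RNAseqData.HNRNPC.bam.chr14_BAMFILES[1]
bf <- BamFile(fl, yieldSize = 100000)

open(bf)                          # an open file remembers its position
n <- 0L
repeat {
    chunk <- readGAlignments(bf)  # next (up to) 100000 alignments
    if (length(chunk) == 0L)      # empty chunk: end of file
        break
    n <- n + length(chunk)        # placeholder for real per-chunk work
}
close(bf)
n
```

Only one chunk is held in memory at a time, so peak memory use is governed by yieldSize rather than by file size.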

Iterative programming model

  • yield a chunk of data
  • map input data to convenient representation, often summarizing input to simplified form
    • E.g., Aligned read coordinates to counts overlapping regions of interest
    • E.g., Aligned read sequences to GC content
  • reduce across mapped chunks
  • Use GenomicFiles::reduceByYield()

    library(GenomicFiles)
    
    yield <- function(bfl) {
        ## input a chunk of alignments
        library(GenomicAlignments)
        readGAlignments(bfl, param=ScanBamParam(what="seq"))
    }
    
    map <- function(aln) { 
        ## Count G or C nucleotides per read
        library(Biostrings)
        gc <- letterFrequency(mcols(aln)$seq, "GC")
        ## Summarize number of reads with 0, 1, ... G or C nucleotides
        tabulate(1 + gc, 73)                # max. read length: 72
    }
    
    reduce <- `+`
  • Example

    library(RNAseqData.HNRNPC.bam.chr14)
    fls <- RNAseqData.HNRNPC.bam.chr14_BAMFILES
    bf <- BamFile(fls[1], yieldSize=100000)
    gc <- reduceByYield(bf, yield, map, reduce)
    plot(gc, type="h",
         xlab="GC Content per Aligned Read", ylab="Number of Reads")

Parallel evaluation

  • Cores, computers, clusters, clouds
  • Generally requires memory management techniques such as restriction or iteration, because parallel processes compete for shared memory
  • Many problems are embarrassingly parallel and lapply()-like, especially in bioinformatics, where parallel evaluation is across files

  • Example: GC content in several BAM files

    library(BiocParallel)
    gc <- bplapply(BamFileList(fls), reduceByYield, yield, map, reduce)
    
    library(ggplot2)
    df <- stack(as.data.frame(lapply(gc, cumsum)))
    df$GC <- 0:72
    ggplot(df, aes(x=GC, y=values)) + geom_line(aes(colour=ind)) +
        xlab("Number of GC Nucleotides per Read") +
        ylab("Number of Reads")

Statistical analysis of differential expression – DESeq2

  1. Experimental design
  2. Wet-lab preparation
  3. High-throughput sequencing
  4. Alignment
    • Whole-genome, or transcriptome
  5. Summary
    • Count reads overlapping regions of interest: GenomicAlignments::summarizeOverlaps()
  6. Statistical analysis
  7. Comprehension

More extensive material

Challenges & solutions

Starting point

Normalization

Error model

Limited sample size

Multiple testing

Work flow

Data representation

Three types of information

  • A matrix of counts of reads overlapping regions of interest
  • A data.frame summarizing samples used in the analysis
  • GenomicRanges describing the regions of interest

SummarizedExperiment coordinates this information

  • Coordinated management of three data resources
  • Easy integration with other Bioconductor software
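
To make the three components concrete, the sketch below assembles a toy object from made-up counts, sample annotations and ranges. It assumes a current Bioconductor release, where the class and constructor live in the SummarizedExperiment package; gene names, sample names and coordinates are invented for illustration.

```r
library(SummarizedExperiment)

## 1. Matrix of counts: 3 regions (rows) x 4 samples (columns)
counts <- matrix(c(10L, 0L, 5L,  12L, 0L, 7L,  30L, 1L, 9L,  28L, 0L, 11L),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 paste0("s", 1:4)))

## 2. Sample description: one row per column of 'counts'
samples <- DataFrame(dex = factor(c("untrt", "trt", "untrt", "trt")),
                     row.names = colnames(counts))

## 3. Regions of interest: one range per row of 'counts'
regions <- GRanges("chr1", IRanges(c(100, 500, 900), width = 200),
                   strand = "+")
names(regions) <- rownames(counts)

se <- SummarizedExperiment(assays = list(counts = counts),
                           rowRanges = regions, colData = samples)

## Coordinated subsetting: counts and sample rows stay in sync
trt <- se[, se$dex == "trt"]
expressed <- se[rowSums(assay(se)) > 1, ]
```

Subsetting by column drops the matching rows of colData, and subsetting by row drops the matching ranges, so the three components cannot drift out of register.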

library("airway")
data(airway)
airway
## class: SummarizedExperiment 
## dim: 64102 8 
## exptData(1): ''
## assays(1): counts
## rownames(64102): ENSG00000000003 ENSG00000000005 ... LRG_98 LRG_99
## rowRanges metadata column names(0):
## colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
## colData names(9): SampleName cell ... Sample BioSample
## main components of SummarizedExperiment
head(assay(airway))
##                 SRR1039508 SRR1039509 SRR1039512 SRR1039513 SRR1039516 SRR1039517 SRR1039520
## ENSG00000000003        679        448        873        408       1138       1047        770
## ENSG00000000005          0          0          0          0          0          0          0
## ENSG00000000419        467        515        621        365        587        799        417
## ENSG00000000457        260        211        263        164        245        331        233
## ENSG00000000460         60         55         40         35         78         63         76
## ENSG00000000938          0          0          2          0          1          0          0
##                 SRR1039521
## ENSG00000000003        572
## ENSG00000000005          0
## ENSG00000000419        508
## ENSG00000000457        229
## ENSG00000000460         60
## ENSG00000000938          0
colData(airway)
## DataFrame with 8 rows and 9 columns
##            SampleName     cell      dex    albut        Run avgLength Experiment    Sample
##              <factor> <factor> <factor> <factor>   <factor> <integer>   <factor>  <factor>
## SRR1039508 GSM1275862   N61311    untrt    untrt SRR1039508       126  SRX384345 SRS508568
## SRR1039509 GSM1275863   N61311      trt    untrt SRR1039509       126  SRX384346 SRS508567
## SRR1039512 GSM1275866  N052611    untrt    untrt SRR1039512       126  SRX384349 SRS508571
## SRR1039513 GSM1275867  N052611      trt    untrt SRR1039513        87  SRX384350 SRS508572
## SRR1039516 GSM1275870  N080611    untrt    untrt SRR1039516       120  SRX384353 SRS508575
## SRR1039517 GSM1275871  N080611      trt    untrt SRR1039517       126  SRX384354 SRS508576
## SRR1039520 GSM1275874  N061011    untrt    untrt SRR1039520       101  SRX384357 SRS508579
## SRR1039521 GSM1275875  N061011      trt    untrt SRR1039521        98  SRX384358 SRS508580
##               BioSample
##                <factor>
## SRR1039508 SAMN02422669
## SRR1039509 SAMN02422675
## SRR1039512 SAMN02422678
## SRR1039513 SAMN02422670
## SRR1039516 SAMN02422682
## SRR1039517 SAMN02422673
## SRR1039520 SAMN02422683
## SRR1039521 SAMN02422677
rowRanges(airway)
## GRangesList object of length 64102:
## $ENSG00000000003 
## GRanges object with 17 ranges and 2 metadata columns:
##        seqnames               ranges strand   |   exon_id       exon_name
##           <Rle>            <IRanges>  <Rle>   | <integer>     <character>
##    [1]        X [99883667, 99884983]      -   |    667145 ENSE00001459322
##    [2]        X [99885756, 99885863]      -   |    667146 ENSE00000868868
##    [3]        X [99887482, 99887565]      -   |    667147 ENSE00000401072
##    [4]        X [99887538, 99887565]      -   |    667148 ENSE00001849132
##    [5]        X [99888402, 99888536]      -   |    667149 ENSE00003554016
##    ...      ...                  ...    ... ...       ...             ...
##   [13]        X [99890555, 99890743]      -   |    667156 ENSE00003512331
##   [14]        X [99891188, 99891686]      -   |    667158 ENSE00001886883
##   [15]        X [99891605, 99891803]      -   |    667159 ENSE00001855382
##   [16]        X [99891790, 99892101]      -   |    667160 ENSE00001863395
##   [17]        X [99894942, 99894988]      -   |    667161 ENSE00001828996
## 
## ...
## <64101 more elements>
## -------
## seqinfo: 722 sequences (1 circular) from an unspecified genome
## e.g., coordinated subset to include dex 'trt'  samples
airway[, airway$dex == "trt"]
## class: SummarizedExperiment 
## dim: 64102 4 
## exptData(1): ''
## assays(1): counts
## rownames(64102): ENSG00000000003 ENSG00000000005 ... LRG_98 LRG_99
## rowRanges metadata column names(0):
## colnames(4): SRR1039509 SRR1039513 SRR1039517 SRR1039521
## colData names(9): SampleName cell ... Sample BioSample
## e.g., keep only rows with non-zero counts
airway <- airway[rowSums(assay(airway)) != 0, ]

DESeq2 work flow

  1. Add experimental design information to the SummarizedExperiment

    library(DESeq2)
    dds <- DESeqDataSet(airway, design = ~ cell + dex)
  2. Perform the essential work flow steps

    dds <- DESeq(dds)
    ## estimating size factors
    ## estimating dispersions
    ## gene-wise dispersion estimates
    ## mean-dispersion relationship
    ## final dispersion estimates
    ## fitting model and testing
    dds
    ## class: DESeqDataSet 
    ## dim: 33469 8 
    ## exptData(1): ''
    ## assays(3): counts mu cooks
    ## rownames(33469): ENSG00000000003 ENSG00000000419 ... ENSG00000273492 ENSG00000273493
    ## rowRanges metadata column names(46): baseMean baseVar ... deviance maxCooks
    ## colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
    ## colData names(10): SampleName cell ... BioSample sizeFactor
  3. Extract results

    res <- results(dds)
    res
    ## log2 fold change (MAP): dex untrt vs trt 
    ## Wald test p-value: dex untrt vs trt 
    ## DataFrame with 33469 rows and 6 columns
    ##                    baseMean log2FoldChange      lfcSE       stat       pvalue        padj
    ##                   <numeric>      <numeric>  <numeric>  <numeric>    <numeric>   <numeric>
    ## ENSG00000000003 708.6021697     0.37424998 0.09873107  3.7906000 0.0001502838 0.001164352
    ## ENSG00000000419 520.2979006    -0.20215550 0.10929899 -1.8495642 0.0643763883 0.181989704
    ## ENSG00000000457 237.1630368    -0.03624826 0.13684258 -0.2648902 0.7910940570 0.901775018
    ## ENSG00000000460  57.9326331     0.08523370 0.24654400  0.3457140 0.7295576915 0.868545776
    ## ENSG00000000938   0.3180984     0.11555962 0.14630523  0.7898530 0.4296136448          NA
    ## ...                     ...            ...        ...        ...          ...         ...
    ## ENSG00000273487   8.1632350    -0.56331132  0.3736236 -1.5076976    0.1316319   0.3066033
    ## ENSG00000273488   8.5844790    -0.10805538  0.3684853 -0.2932420    0.7693372   0.8900081
    ## ENSG00000273489   0.2758994    -0.11282164  0.1424265 -0.7921393    0.4282794          NA
    ## ENSG00000273492   0.1059784     0.07644378  0.1248627  0.6122225    0.5403906          NA
    ## ENSG00000273493   0.1061417     0.07628747  0.1250713  0.6099516    0.5418939          NA
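
The padj column holds p-values adjusted for multiple testing; by default DESeq2 uses the Benjamini-Hochberg method, and padj is NA for genes that were filtered out before adjustment. The base R sketch below shows the same adjustment on a handful of invented p-values, and why which() is convenient when selecting significant genes.

```r
## Invented p-values; geneE mimics a gene with no usable test result
pvalue <- c(geneA = 0.0001, geneB = 0.02, geneC = 0.04,
            geneD = 0.6, geneE = NA)

## Benjamini-Hochberg adjustment; NA values are ignored and stay NA
padj <- p.adjust(pvalue, method = "BH")

## which() drops NA, so genes without an adjusted p-value are not selected
sig <- names(padj)[which(padj < 0.1)]
sig
```

The same idiom applies to a results table, e.g., res[which(res$padj < 0.1), ], where the 0.1 threshold is an illustrative choice.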

Interactive visualization – shiny

Writing a shiny app

A simple directory with user interface (ui.R) and server (server.R) R scripts
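
A minimal sketch of the two pieces, written here as a single script for brevity. In the two-file layout, the ui object would be the content of ui.R and the server function the content of server.R. The input name "n", the output name "hist" and the simulated data are all made up for illustration.

```r
library(shiny)

## User interface: one input control and one output placeholder
ui <- fluidPage(
    titlePanel("Simulated read lengths"),
    sliderInput("n", "Number of reads", min = 10, max = 1000, value = 100),
    plotOutput("hist")
)

## Server: re-draws the plot whenever input$n changes
server <- function(input, output) {
    output$hist <- renderPlot({
        hist(rnorm(input$n, mean = 72, sd = 5),
             main = "Simulated data", xlab = "Read length")
    })
}

## Bundle both parts; launch interactively with runApp(app)
app <- shinyApp(ui = ui, server = server)
```

With the directory layout, the app is started with runApp() pointed at the directory containing the two scripts.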

Conclusion

Merits of Bioconductor for High-Throughput Sequence Analysis

Acknowledgements

BioC 2015 Annual Conference, Seattle, WA, 20-22 July.

Key references

sessionInfo()

sessionInfo()
## R version 3.2.1 Patched (2015-06-19 r68553)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.2 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
##  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
## [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] shiny_0.12.0                            ggplot2_1.0.1                          
##  [3] airway_0.102.0                          RNAseqData.HNRNPC.bam.chr14_0.6.0      
##  [5] Homo.sapiens_1.1.2                      TxDb.Hsapiens.UCSC.hg19.knownGene_3.1.2
##  [7] org.Hs.eg.db_3.1.2                      GO.db_3.1.2                            
##  [9] RSQLite_1.0.0                           DBI_0.3.1                              
## [11] OrganismDbi_1.10.0                      GenomicFeatures_1.20.1                 
## [13] AnnotationDbi_1.30.1                    Biobase_2.28.0                         
## [15] GenomicFiles_1.4.0                      BiocParallel_1.2.2                     
## [17] rtracklayer_1.28.4                      GenomicAlignments_1.4.1                
## [19] Rsamtools_1.20.4                        DESeq2_1.8.1                           
## [21] RcppArmadillo_0.5.200.1.0               Rcpp_0.11.6                            
## [23] GenomicRanges_1.20.5                    GenomeInfoDb_1.4.0                     
## [25] Biostrings_2.36.1                       XVector_0.8.0                          
## [27] IRanges_2.2.4                           S4Vectors_0.6.0                        
## [29] BiocGenerics_0.14.0                     AnnotationHub_2.0.2                    
## [31] BiocStyle_1.6.0                         BiocInstaller_1.18.3                   
## 
## loaded via a namespace (and not attached):
##  [1] httr_0.6.1                   splines_3.2.1                Formula_1.2-1               
##  [4] interactiveDisplayBase_1.6.0 latticeExtra_0.6-26          RBGL_1.44.0                 
##  [7] yaml_2.1.13                  lattice_0.20-31              digest_0.6.8                
## [10] RColorBrewer_1.1-2           colorspace_1.2-6             htmltools_0.2.6             
## [13] httpuv_1.3.2                 plyr_1.8.2                   XML_3.98-1.2                
## [16] biomaRt_2.24.0               genefilter_1.50.0            zlibbioc_1.14.0             
## [19] xtable_1.7-4                 snow_0.3-13                  scales_0.2.4                
## [22] annotate_1.46.0              nnet_7.3-9                   proto_0.3-10                
## [25] survival_2.38-2              magrittr_1.5                 mime_0.3                    
## [28] evaluate_0.7                 MASS_7.3-41                  foreign_0.8-63              
## [31] graph_1.46.0                 tools_3.2.1                  formatR_1.2                 
## [34] stringr_1.0.0                munsell_0.4.2                locfit_1.5-9.1              
## [37] cluster_2.0.2                lambda.r_1.1.7               futile.logger_1.4.1         
## [40] grid_3.2.1                   RCurl_1.95-4.6               labeling_0.3                
## [43] bitops_1.0-6                 rmarkdown_0.6.1              codetools_0.2-11            
## [46] gtable_0.1.2                 reshape2_1.4.1               R6_2.0.1                    
## [49] gridExtra_0.9.1              knitr_1.10.5                 Hmisc_3.16-0                
## [52] futile.options_1.0.0         stringi_0.4-1                geneplotter_1.46.0          
## [55] rpart_4.1-9                  acepack_1.3-3.3