1 Basics

1.1 Install derfinder

R is an open-source statistical environment that can be easily extended via packages. derfinder is an R package available from the Bioconductor repository. R can be installed on any operating system from CRAN, after which you can install derfinder by running the following commands in your R session:

## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite("derfinder")

## Check that you have a valid Bioconductor installation
biocValid()
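
The biocLite() script applies to the Bioconductor releases that were current when this document was written. From Bioconductor 3.8 onwards, installation goes through the BiocManager package instead. A minimal sketch, assuming a recent version of R and a working internet connection:

```r
## Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

## Install derfinder from Bioconductor
BiocManager::install("derfinder")

## Check that you have a valid Bioconductor installation
BiocManager::valid()
```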

1.2 Required knowledge

derfinder builds on many other packages, in particular those that implement the infrastructure needed for dealing with RNA-seq data: packages such as Rsamtools, GenomicAlignments and rtracklayer, which allow you to import the data. A derfinder user is not expected to deal with those packages directly but will need to be familiar with GenomicRanges to understand the results derfinder generates. It is also worth checking the BiocParallel package for performing parallel computations.
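
As a rough orientation for readers new to GenomicRanges, the sketch below builds a small GRanges object and applies the accessors most often needed when reading region-level results. The coordinates and the meanCoverage column are made up for illustration and do not come from derfinder.

```r
## Minimal GenomicRanges sketch; coordinates and values are made up
library('GenomicRanges')

gr <- GRanges(
    seqnames = 'chr21',
    ranges = IRanges(start = c(100, 500, 900), end = c(150, 620, 1000)),
    strand = '*',
    meanCoverage = c(35.2, 48.9, 31.0) # hypothetical metadata column
)

## Basic accessors: chromosome, start position and width of each range
seqnames(gr)
start(gr)
width(gr)

## Metadata columns, as a table or one column at a time
mcols(gr)
gr$meanCoverage

## Subset the ranges like a vector, here keeping those wider than 100 bp
gr[width(gr) > 100]
```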

If you are asking yourself the question “Where do I start using Bioconductor?” you might be interested in this blog post.

1.3 Asking for help

As package developers, we try to explain clearly how to use our packages and in which order to use the functions. But R and Bioconductor have a steep learning curve, so it is critical to learn where to ask for help. The blog post mentioned above lists some options, but we would like to highlight the Bioconductor support site as the main resource for getting help: remember to use the derfinder tag and check the older posts. Other alternatives are available, such as creating GitHub issues and tweeting. However, please note that if you want to receive help you should adhere to the posting guidelines. It is particularly critical that you provide a small reproducible example and your session information so package developers can track down the source of the error.
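
The session information requested above can be collected with base R. A minimal sketch that you could run at the end of your reproducible example and paste into your post:

```r
## Collect the R version, platform and attached package versions
info <- sessionInfo()
print(info)

## Report the derfinder version, if the package is installed
if (requireNamespace('derfinder', quietly = TRUE)) {
    print(packageVersion('derfinder'))
}
```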

We would like to highlight the derfinder user Jessica Hekman. She has used derfinder with non-human data, and in the process of doing so discovered some small bugs and sections of the documentation that were not clear.

1.4 Citing derfinder

We hope that derfinder will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!

## Citation info
citation('derfinder')
## 
## Collado-Torres L, Nellore A, Frazee AC, Wilks C, Love MI, Langmead
## B, Irizarry RA, Leek JT and Jaffe AE (2017). "Flexible expressed
## region analysis for RNA-seq with derfinder." _Nucl. Acids Res._.
## doi: 10.1093/nar/gkw852 (URL: http://doi.org/10.1093/nar/gkw852),
## <URL:
## http://nar.oxfordjournals.org/content/early/2016/09/29/nar.gkw852>.
## 
## Frazee AC, Sabunciyan S, Hansen KD, Irizarry RA and Leek JT (2014).
## "Differential expression analysis of RNA-seq data at single-base
## resolution." _Biostatistics_, *15 (3)*, pp. 413-426. doi:
## 10.1093/biostatistics/kxt053 (URL:
## http://doi.org/10.1093/biostatistics/kxt053), <URL:
## http://biostatistics.oxfordjournals.org/content/15/3/413.long>.
## 
## Collado-Torres L, Jaffe AE and Leek JT (2017). _derfinder:
## Annotation-agnostic differential expression analysis of RNA-seq data
## at base-pair resolution via the DER Finder approach_. doi:
## 10.18129/B9.bioc.derfinder (URL:
## http://doi.org/10.18129/B9.bioc.derfinder),
## https://github.com/lcolladotor/derfinder - R package version 1.12.0,
## <URL: http://www.bioconductor.org/packages/derfinder>.
## 
## To see these entries in BibTeX format, use 'print(<citation>,
## bibtex=TRUE)', 'toBibtex(.)', or set
## 'options(citation.bibtex.max=999)'.

2 Quick start to using derfinder

Here is a very quick example of a DER Finder analysis. This analysis is explained in more detail later on in this document.

## Load libraries
library('derfinder')
library('derfinderData')
library('GenomicRanges')

## Determine the files to use and fix the names
files <- rawFiles(system.file('extdata', 'AMY', package = 'derfinderData'),
    samplepatt = 'bw', fileterm = NULL)
## Escape the dot and anchor the pattern so only a trailing '.bw' is removed
names(files) <- gsub('\\.bw$', '', names(files))

## Load the data from disk -- On Windows you have to load data from Bam files
fullCov <- fullCoverage(files = files, chrs = 'chr21', verbose = FALSE)

## Get the region matrix of Expressed Regions (ERs)
regionMat <- regionMatrix(fullCov, cutoff = 30, L = 76, verbose = FALSE)

## Get pheno table
pheno <- subset(brainspanPheno, structure_acronym == 'AMY')

## Identify which ERs are differentially expressed, that is, find the DERs
library('DESeq2')

## Round the coverage matrix to integer counts, as required by DESeq2
counts <- round(regionMat$chr21$coverageMatrix)

## Specify the design and build the DESeqDataSet
dse <- DESeqDataSetFromMatrix(counts, pheno, ~ group + gender)

## Perform DE analysis
dse <- DESeq(dse, test = 'LRT', reduced = ~ gender, fitType = 'local')

## Extract results
mcols(regionMat$chr21$regions) <- c(mcols(regionMat$chr21$regions), results(dse))

## Save info in an object with a shorter name
ers <- regionMat$chr21$regions
ers
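
Once the DESeq2 results are stored as metadata columns, the regions can be filtered like any other GRanges object. The sketch below is one way to keep the ERs that pass an adjusted p-value threshold. The helper significantRegions() is hypothetical (it is not part of derfinder) and the 0.05 cutoff is only an illustrative choice. It assumes the regions carry the padj column produced by results().

```r
library('GenomicRanges')

## Hypothetical helper, not part of derfinder: keep regions with an
## adjusted p-value below the cutoff, sorted from most to least significant
significantRegions <- function(regions, cutoff = 0.05) {
    stopifnot('padj' %in% colnames(mcols(regions)))
    ## which() drops regions where padj is NA
    keep <- which(regions$padj < cutoff)
    sig <- regions[keep]
    sig[order(sig$padj)]
}

## Apply it to the quick start object when it is available
if (exists('ers')) {
    ders <- significantRegions(ers)
    print(ders)
}
```

Using which() instead of a plain logical subset matters here because DESeq2 can report NA adjusted p-values, and NA values are not valid subscripts for a GRanges object.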

3 Introduction

derfinder is an R package that implements the DER Finder approach (Frazee, Sabunci