# 1 Motivation

This package is designed to:

• Replace command-line utilities (e.g. bedtools and deepTools) for most post-alignment processing
• Be easy to use and easy to install, without requiring external dependencies, e.g. HTSlib or the kent source utilities from the UCSC Genome Browser
• Allow users to string together common analysis pipelines with simple, fast-running one-liners
• Avoid code repetition by providing tested and validated code
• Exploit the properties of basepair-resolution data to optimize performance and increase user-friendliness
• Use process forking to make use of multicore processors
• Maximize compatibility with Bioconductor’s rich ecosystem of analysis software, in addition to leveraging the traditional strengths of R in statistics and data visualization
• Fully replace the bigWig R package

# 2 Features

• Process and import bedGraph, bigWig, and bam files quickly and easily, with several pre-configured defaults for typical uses
• Count and filter spike-in reads
• Calculate spike-in normalization factors using several methods and options, including options for batch normalization
• Count reads by regions of interest
• Count reads at positions within regions of interest, at single-base resolution or in larger bins, and generate count matrices for heatmapping
• Calculate bootstrapped signal (e.g. readcount) profiles with confidence intervals (i.e. meta-profiles)
• Modify gene regions (e.g. extract promoters or genebody regions) using a single simple and straightforward function
• Conveniently and efficiently call DESeq2 to calculate differential expression in a manner that is robust to global changes (this avoids the default behavior of calculating genewise dispersion across all samples present, which is invalid if any experimental condition causes broad changes)
• Use non-contiguous genes in DESeq2 analysis, e.g. to exclude specific sites/peaks from the analysis (not usually supported by DESeq2)
• Efficiently generate results across a list of comparisons
• Support blacklisting throughout, with proper accounting of blacklisted sites in relevant calculations
• Users interact with an intuitive and computationally efficient data structure (the “basepair resolution GRanges” object), which is already supported by a rich, user-friendly suite of tools that greatly simplify working with datasets and annotations
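
To make the "bootstrapped signal profiles with confidence intervals" feature above more concrete, here is a minimal, language-agnostic sketch of the general idea, written in Python using only the standard library. It is not this package's API or implementation: the function name `bootstrap_profile`, its parameters, the default quantiles, and the choice to resample regions with replacement are all assumptions made for illustration, and the package's own resampling scheme may differ.

```python
import random
import statistics


def bootstrap_profile(count_matrix, n_iter=200, lower=0.125, upper=0.875, seed=0):
    """Hypothetical helper: bootstrap a meta-profile from a count matrix.

    count_matrix: one row per region, one column per position (or bin).
    Returns one dict per position with lower/mid/upper estimates of the mean.
    """
    rng = random.Random(seed)
    n_regions = len(count_matrix)
    n_pos = len(count_matrix[0])

    # Each iteration resamples regions with replacement (assumed scheme)
    # and records the mean signal at every position.
    iter_means = []
    for _ in range(n_iter):
        rows = rng.choices(count_matrix, k=n_regions)
        iter_means.append(
            [statistics.fmean(row[j] for row in rows) for j in range(n_pos)]
        )

    # Summarize the distribution of bootstrapped means at each position.
    profile = []
    for j in range(n_pos):
        col = sorted(means[j] for means in iter_means)
        profile.append({
            "position": j,
            "lower": col[int(lower * (n_iter - 1))],
            "mid": statistics.median(col),
            "upper": col[int(upper * (n_iter - 1))],
        })
    return profile


# Toy input: 4 regions x 3 positions of read counts (made-up numbers).
toy_counts = [
    [0, 2, 5],
    [1, 3, 4],
    [0, 1, 6],
    [2, 2, 3],
]
for point in bootstrap_profile(toy_counts):
    print(point)
```

The input here corresponds to the kind of count matrix described in the features above (reads counted at positions within regions of interest); the output is one lower/middle/upper triple per position, which is what a meta-profile plot with a confidence band would draw.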

# 3 Coming Soon

Data processing:

• Summarizing and plotting replicate correlations
• A function that uses random read sampling to assess whether sequencing depth is sufficient to stabilize arbitrary calculations (users can supply an anonymous function to calculate, e.g., rank expression, power analyses, differential expression by DESeq2, or pausing indices)

Signal counting and analysis:

• Two-stranded meta-profile calculations
• Automated generation of a list of DESeq2 comparisons using all possible combinations; all possible permutations; or by defining a simple hierarchy of each-vs-one comparisons