This is a list of the last 100 packages added to Bioconductor and available in the development version of Bioconductor. The list is also available as an RSS Feed.

alpine alpine

Fragment sequence bias modeling and correction for RNA-seq transcript abundance estimation.

GAprediction Prediction of gestational age with Illumina HumanMethylation450 data

[GAprediction] predicts gestational age using Illumina HumanMethylation450 CpG data.

SVAPLSseq SVAPLSseq-An R package to adjust for the hidden factors of variability in differential gene expression studies based on RNAseq data

The package contains functions that are intended for the identification of differentially expressed genes between two groups of samples from RNAseq data after adjusting for various hidden biological and technical factors of variability.

CancerSubtypes Cancer subtypes identification, validation and visualization based on genomic data

CancerSubtypes integrates the current common computational biology methods for cancer subtypes identification and provides a standardized framework for cancer subtype analysis based on the genomic datasets.

GRmetrics Calculate growth-rate inhibition (GR) metrics

Functions for calculating and visualizing growth-rate inhibition (GR) metrics.
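
As a rough illustration of what a GR value measures, the sketch below uses the commonly cited definition GR(c) = 2^(log2(x(c)/x0) / log2(x_ctrl/x0)) - 1, where x0, x_ctrl and x(c) are cell counts at the start of treatment, in the untreated control and in the treated sample at the end of the assay. The function name and arguments are hypothetical and are not the GRmetrics API (the package itself is written in R).

```python
from math import log2

def gr_value(x_treated, x_control, x_start):
    """Growth-rate inhibition value for one treated sample.

    Hypothetical helper for illustration only:
      x_treated: cell count in the treated sample at the end of the assay
      x_control: cell count in the untreated control at the end of the assay
      x_start:   cell count at the time of treatment
    Assumes the control grew during the assay (x_control > x_start > 0).
    """
    growth_ratio = log2(x_treated / x_start) / log2(x_control / x_start)
    return 2.0 ** growth_ratio - 1.0

# 1 means no effect, 0 means complete cytostasis, negative values mean cell loss.
no_effect = gr_value(400, 400, 100)
cytostatic = gr_value(100, 400, 100)
cytotoxic = gr_value(50, 400, 100)
```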

switchde Switch-like differential expression across single-cell trajectories

Inference and detection of switch-like differential expression across single-cell RNA-seq trajectories.

IPO Automated Optimization of XCMS Data Processing parameters

The outcome of XCMS data processing strongly depends on the parameter settings. IPO (`Isotopologue Parameter Optimization`) is a parameter optimization tool that is applicable for different kinds of samples and liquid chromatography coupled to high resolution mass spectrometry devices, fast and free of labeling steps. IPO uses natural, stable 13C isotopes to calculate a peak picking score. Retention time correction is optimized by minimizing the relative retention time differences within features and grouping parameters are optimized by maximizing the number of features showing exactly one peak from each injection of a pooled sample. The different parameter settings are achieved by design of experiment. The resulting scores are evaluated using response surface models.

CVE Cancer Variant Explorer

Shiny app for interactive variant prioritisation in precision cancer medicine. The input file for CVE is the output file of the recently released Oncotator Variant Annotation tool summarising variant-centric information from 14 different publicly available resources relevant for cancer research. Interactive prioritisation in CVE is based on known germline and cancer variants, DNA repair genes and functional prediction scores. An optional feature of CVE is the exploration of the tumour-specific pathway context that is facilitated using co-expression modules generated from publicly available transcriptome data. Finally, druggability of prioritised variants is assessed using the Drug Gene Interaction Database (DGIdb).

BayesKnockdown BayesKnockdown: Posterior Probabilities for Edges from Knockdown Data

A simple, fast Bayesian method for computing posterior probabilities for relationships between a single predictor variable and multiple potential outcome variables, incorporating prior probabilities of relationships. In the context of knockdown experiments, the predictor variable is the knocked-down gene, while the other genes are potential targets. Can also be used for differential expression/2-class data.

recount Explore and download data from the recount project

Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://biorxiv.org/content/early/2016/08/08/068478.

ASpli Analysis of alternative splicing using RNA-Seq

Integrative pipeline for the analysis of alternative splicing using RNAseq.

ctsGE Clustering of Time Series Gene Expression data

Methodology for supervised clustering of potentially many predictor variables, such as genes, in time series datasets. Provides functions that help the user assign genes to a predefined set of model profiles.

MODA MODA: MOdule Differential Analysis for weighted gene co-expression network

MODA can be used to estimate and construct condition-specific gene co-expression networks, and identify differentially expressed subnetworks as conserved or condition specific modules which are potentially associated with relevant biological processes.

SNPediaR Query data from SNPedia

SNPediaR provides some tools for downloading and parsing data from the SNPedia web site. The implemented functions allow users to import the wiki text available in SNPedia pages and to extract the most relevant information out of them. If some information in the downloaded pages is not automatically processed by the library functions, users can easily implement their own parsers to access it in an efficient way.

GeneGeneInteR Tools for Testing Gene-Gene Interaction at the Gene Level

The aim of this package is to propose several methods for testing gene-gene interaction in case-control association studies. Such a test can be done by aggregating SNP-SNP interaction tests performed at the SNP level (SSI) or by using gene-gene multidimensional methods (GGI). The package also proposes tools for a graphic display of the results.

ImpulseDE Detection of DE genes in time series data using impulse models

ImpulseDE is suited to capture single impulse-like patterns in high throughput time series datasets. By fitting a representative impulse model to each gene, it reports differentially expressed genes whether across time points in a single experiment or between two time courses from two experiments. To optimize the running time, the code makes use of clustering steps and multi-threading.

RCAS RNA Centric Annotation System

RNA Centric Annotation System is an automated system that provides dynamic annotations for custom input files that contain transcriptomic target regions. Such transcriptomic target regions could be, for instance, peak regions detected by CLIP-Seq analysis that detect protein-RNA interactions, MeRIP-Seq analysis that detect RNA modifications (alias the epitranscriptome), or any collection of target regions at the level of the transcriptome. RCAS contains wrapper functions to do de novo motif discovery using functions from the motifRG package. RCAS overlays the input target regions with the annotated protein-coding genes and calculates the Gene Ontology (GO) terms that may be enriched or depleted in the input target regions compared to the background list of protein-coding genes. A classical Fisher's exact test is applied for each GO term and the p-values obtained for each GO term are corrected for multiple testing using both the False Discovery Rate and the Family-Wise Error Rate. Similarly to the GO term enrichment analysis, RCAS also detects sets of genes as annotated in the Molecular Signatures Database that are enriched or depleted in the queried target regions. Results are corrected for multiple testing according to both the False Discovery Rate and the Family-Wise Error Rate.
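
The enrichment step described above can be sketched in a few lines. This is a minimal illustration, not RCAS code: the function names are hypothetical, only the one-sided (enrichment) tail of Fisher's exact test is shown, and Benjamini-Hochberg and Bonferroni are assumed here as representative False Discovery Rate and Family-Wise Error Rate procedures.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test, computed as a hypergeometric upper tail.

    k: target genes annotated with the term
    n: total target genes
    K: background genes annotated with the term
    N: total background genes
    Returns P(X >= k) when n genes are drawn at random from the background.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

def bonferroni(pvals):
    """Family-wise error rate control: multiply by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def benjamini_hochberg(pvals):
    """False discovery rate control via the step-up procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy example: 8 of 20 target genes carry a term that 50 of 1000 background genes carry.
p_term = fisher_enrichment_p(8, 20, 50, 1000)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
fwer = bonferroni([0.01, 0.04, 0.03, 0.5])
```

A depletion test would use the lower tail, P(X <= k), in the same way.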

ASAFE Ancestry Specific Allele Frequency Estimation

Given admixed individuals' bi-allelic SNP genotypes and ancestry pairs (where each ancestry can take one of three values) for multiple SNPs, perform an EM algorithm to deal with the fact that SNP genotypes are unphased with respect to ancestry pairs, in order to estimate ancestry-specific allele frequencies for all SNPs.

maftools Summarize, Analyze and Visualize MAF files

Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform most commonly used analyses in cancer genomics and to create feature-rich customizable visualizations with minimal effort.

rDGIdb R Wrapper for DGIdb

The rDGIdb package provides a wrapper for the Drug Gene Interaction Database (DGIdb). For simplicity, the wrapper query function and output resembles the user interface and results format provided on the DGIdb website (http://dgidb.genome.wustl.edu/).

methylKit DNA methylation analysis from high-throughput bisulfite sequencing results

methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Perl is needed to read SAM files only.

DeepBlueR DeepBlueR

Accessing the DeepBlue Epigenetics Data Server through R.

LOBSTAHS Lipid and Oxylipin Biomarker Screening through Adduct Hierarchy Sequences

LOBSTAHS is a multifunction package for screening, annotation, and putative identification of mass spectral features in large, HPLC-MS lipid datasets. In silico data for a wide range of lipids, oxidized lipids, and oxylipins can be generated from user-supplied structural criteria with a database generation function. LOBSTAHS then applies these databases to assign putative compound identities to features in any high-mass accuracy dataset that has been processed using xcms and CAMERA. Users can then apply a series of orthogonal screening criteria based on adduct ion formation patterns, chromatographic retention time, and other properties, to evaluate and assign confidence scores to this list of preliminary assignments. During the screening routine, LOBSTAHS rejects assignments that do not meet the specified criteria, identifies potential isomers and isobars, and assigns a variety of annotation codes to assist the user in evaluating the accuracy of each assignment.

AMOUNTAIN Active modules for multilayer weighted gene co-expression networks: a continuous optimization approach

A purely data-driven gene network, the weighted gene co-expression network (WGCN), can be constructed from expression profiles alone. Different layers in such networks may represent different time points, multiple conditions or various species. AMOUNTAIN aims to search active modules in multi-layer WGCN using a continuous optimization approach.

clusterExperiment Compare clusterings for single-cell sequencing

This package provides functions for running and comparing many different clusterings of single-cell sequencing data.

MGFR Marker Gene Finder in RNA-seq data

The package is designed to detect marker genes from RNA-seq data.

CytoML GatingML interface for openCyto

This package is designed to use GatingML2.0 as the standard format to exchange gated data with other software platforms.

fgsea Fast Gene Set Enrichment Analysis

The package implements an algorithm for fast gene set enrichment analysis. The fast algorithm makes it possible to run more permutations and obtain more fine-grained p-values, which in turn allows accurate standard approaches to multiple hypothesis correction to be used.
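
To see why the number of permutations limits p-value resolution, consider a plain permutation test. The sketch below is not the fgsea algorithm: it uses a simplified statistic (mean score of the gene set) and hypothetical names, and only shows that an empirical p-value cannot fall below 1 / (n_perm + 1).

```python
import random

def permutation_pvalue(scores, gene_set, n_perm, seed=0):
    """Empirical p-value for a gene set having a high mean score.

    scores:   dict mapping gene name -> numeric score (hypothetical input)
    gene_set: list of gene names to test
    n_perm:   number of random gene sets of the same size to draw
    """
    rng = random.Random(seed)
    genes = list(scores)
    size = len(gene_set)
    observed = sum(scores[g] for g in gene_set) / size
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, size)
        if sum(scores[g] for g in sample) / size >= observed:
            hits += 1
    # Adding 1 to numerator and denominator keeps the estimate away from zero.
    return (hits + 1) / (n_perm + 1)

# Toy data: 20 genes with integer scores; the tested set holds the top three.
toy_scores = {f"g{i}": i for i in range(20)}
top_set = ["g17", "g18", "g19"]
p_coarse = permutation_pvalue(toy_scores, top_set, n_perm=99)    # never below 1/100
p_fine = permutation_pvalue(toy_scores, top_set, n_perm=9999)    # never below 1/10000
```

With many gene sets tested at once, multiple-testing correction needs p-values well below the floor that a small number of permutations can reach.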

qsea IP-seq data analysis and visualization

qsea (quantitative sequencing enrichment analysis) was developed as the successor of the MEDIPS package for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, qsea provides several functionalities for the analysis of other kinds of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential enrichment between groups of samples.

ccmap Combination Connectivity Mapping

Finds drugs and drug combinations that are predicted to reverse or mimic gene expression signatures. These drugs might reverse diseases or mimic healthy lifestyles.

normr Normalization and difference calling in ChIP-seq data

Robust normalization and difference calling procedures for ChIP-seq and alike data. Read counts are modeled jointly as a binomial mixture model with a user-specified number of components. A fitted background estimate accounts for the effect of enrichment in certain regions and, therefore, represents an appropriate null hypothesis. This robust background is used to identify significantly enriched or depleted regions.

chromstaR Combinatorial and Differential Chromatin State Analysis for ChIP-Seq Data

This package implements functions for combinatorial and differential analysis of ChIP-seq data. It includes uni- and multivariate peak-calling, export to genome browser viewable files, and functions for enrichment analyses.

esetVis Visualizations of expressionSet Bioconductor object

Utility functions for visualization of expressionSet (or SummarizedExperiment) Bioconductor objects, including spectral map, tsne and linear discriminant analysis. Static plots via the ggplot2 package or interactive plots via the ggvis or rbokeh packages are available.

MetCirc MetCirc - a workflow to analyse and visualise metabolomics data

MetCirc comprises a workflow to interactively explore metabolomics data: create MSP, bin m/z values, calculate similarity between precursors and visualise similarities.

crossmeta Cross Platform Meta-Analysis of Microarrays

Implements cross platform and species meta-analyses of Affymetrix, Illumina, and Agilent microarray data. This package automates common tasks such as downloading, normalizing, and annotating raw GEO data. A user interface makes it easy to select control and treatment samples for each contrast and study. This input is used for subsequent surrogate variable analysis (which models unaccounted-for sources of variation) and differential expression analysis. Effect sizes from differential expression analyses can be combined. This meta-analysis can assess genes measured in only a subset of studies.

sights Statistics and dIagnostic Graphs for HTS

SIGHTS is a suite of normalization methods, statistical tests, and diagnostic graphical tools for high throughput screening (HTS) assays. HTS assays use microtitre plates to screen large libraries of compounds for their biological, chemical, or biochemical activity.

geneplast Evolutionary and plasticity analysis based on orthologous groups distribution

Geneplast is designed for evolutionary and plasticity analysis based on orthologous groups distribution in a given species tree. It uses Shannon information theory and orthologs abundance to estimate the Evolutionary Plasticity Index. Additionally, it implements the Bridge algorithm to determine the evolutionary root of a given gene based on its orthologs distribution.
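
As a hedged illustration of the Shannon-information idea mentioned above, the sketch below computes a normalized Shannon entropy over the number of orthologs found per species. It does not reproduce geneplast's Evolutionary Plasticity Index, whose exact definition is not given here; the function name and input layout are assumptions.

```python
from math import log2

def normalized_shannon_entropy(counts):
    """Shannon entropy of a count vector, scaled to the range [0, 1].

    counts: number of orthologs of one orthologous group in each species
            (hypothetical input layout, one entry per species).
    Returns 1.0 when orthologs are spread evenly over all species and
    0.0 when they are confined to a single species.
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * log2(p) for p in probs)
    return entropy / log2(len(counts))

even = normalized_shannon_entropy([1, 1, 1, 1])      # evenly distributed group
skewed = normalized_shannon_entropy([4, 0, 0, 0])    # group confined to one species
```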

covEB Empirical Bayes estimate of block diagonal covariance matrices

Uses Bayesian methods to estimate correlation matrices, assuming that they can be written and estimated as block diagonal matrices. These block diagonal matrices are determined using shrinkage parameters that set values below the parameter to zero.

MADSEQ Mosaic Aneuploidy Detection and Quantification using Massive Parallel Sequencing Data

The MADSEQ package provides a group of hierarchical Bayesian models for the detection of mosaic aneuploidy, the inference of the type of aneuploidy and also for the quantification of the fraction of aneuploid cells in the sample.

GEM GEM: fast association study for the interplay of Gene, Environment and Methylation

Tools for genome-wide analysis of EWAS, methQTL and GxE.

NetCRG NetCRG: An R package for studying cooperative response genes from transcriptomics data

NetCRG is an acronym for Network of Cooperation Response Gene.

FunChIP Clustering and Alignment of ChIP-Seq peaks based on their shapes

Preprocessing and smoothing of ChIP-Seq peaks and efficient implementation of the k-mean alignment algorithm to classify them.

netprioR A model for network-based prioritisation of genes

A model for semi-supervised prioritisation of genes integrating network data, phenotypes and additional prior knowledge about TP and TN gene labels from the literature or experts.

Pigengene Infers biological signatures from gene expression data

Pigengene package provides an efficient way to infer biological signatures from gene expression profiles. The signatures are independent from the underlying platform, e.g., the input can be microarray or RNA Seq data. It can even infer the signatures using data from one platform, and evaluate them on the other. Pigengene identifies the modules (clusters) of highly coexpressed genes using coexpression network analysis, summarizes the biological information of each module in an eigengene, learns a Bayesian network that models the probabilistic dependencies between modules, and builds a decision tree based on the expression of eigengenes.

bioCancer Interactive Multi-Omics Cancers Data Visualization and Analysis

bioCancer is a Shiny App to visualize and analyse interactively Multi-Assays of Cancer Genomic Data.

msPurity Performs assessments and predictions of MSMS precursor purity

Performs assessments and predictions of MS/MS precursor purity. Works for both LC-MS(/MS) and DI-MS(/MS) data. Also provides simple processing steps for DI-MS data.

covRNA Multivariate Analysis of Transcriptomic Data

This package provides the analysis methods fourthcorner and RLQ analysis for large-scale transcriptomic data.

dSimer Integration of Disease Similarity Measures

dSimer is an R package which provides eight function-based methods for disease similarity calculation. These eight methods take advantage of diverse biological data which calculate disease similarity from different perspectives. The disease similarity matrix obtained from these eight methods can also be visualized by dSimer.

Director A dynamic visualization tool of multi-level data

Director is an R package designed to streamline the visualization of molecular effects in regulatory cascades. It utilizes the R package htmltools and a modified Sankey plugin of the JavaScript library D3 to provide a fast and easy, browser-enabled solution to discovering potentially interesting downstream effects of regulatory and/or co-expressed molecules. The diagrams are robust, interactive, and packaged as highly-portable HTML files that eliminate the need for third-party software to view. This enables a straightforward approach for scientists to interpret the data produced, and bioinformatics developers an alternative means to present relevant data.

epivizrStandalone Run Epiviz Interactive Genomic Data Visualization App within R

This package imports the epiviz visualization JavaScript app for genomic data interactive visualization. The 'epivizrServer' package is used to provide a web server running completely within R. This standalone version allows browsing of arbitrary genomes through genome annotations provided by Bioconductor packages.

EGAD Extending guilt by association by degree

The package implements a series of highly efficient tools to calculate functional properties of networks based on guilt by association methods.

ImmuneSpaceR A Thin Wrapper around the ImmuneSpace Database

Provides a convenient API for accessing data sets within ImmuneSpace (www.immunespace.org), the data repository and analysis platform of the Human Immunology Project Consortium (HIPC).

MMDiff2 Statistical Testing for ChIP-Seq data sets

This package detects statistically significant differences between read enrichment profiles in different ChIP-Seq samples. To take advantage of shape differences it uses Kernel methods (Maximum Mean Discrepancy, MMD).
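
The Maximum Mean Discrepancy itself is easy to state. The following is a minimal sketch of the biased MMD² estimate with a Gaussian kernel on one-dimensional values; it is not MMDiff2 code, and the choice of kernel, bandwidth and function names are assumptions made for illustration.

```python
from math import exp

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two scalar values."""
    return exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy.

    MMD^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y),
    with each mean taken over all pairs, including identical points.
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, sigma) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Toy read positions within a peak for three hypothetical samples.
sample_a = [0.0, 1.0, 2.0]
sample_b = [0.5, 1.5, 2.5]      # similar shape, slightly shifted
sample_c = [10.0, 11.0, 12.0]   # clearly different
near = mmd_squared(sample_a, sample_b)
far = mmd_squared(sample_a, sample_c)
```

Identical samples give a value of zero, and the value grows as the two empirical distributions move apart.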

EGSEA Ensemble of Gene Set Enrichment Analyses

This package implements the Ensemble of Gene Set Enrichment Analyses (EGSEA) method for gene set testing.

CHRONOS CHRONOS: A time-varying method for microRNA-mediated sub-pathway enrichment analysis

A package used for efficient unraveling of the inherent dynamic properties of pathways. MicroRNA-mediated subpathway topologies are extracted and evaluated by exploiting the temporal transition and the fold change activity of the linked genes/microRNAs.

diffloop Differential DNA loop calling from ChIA-PET data

A suite of tools for subsetting, visualizing, annotating, and statistically analyzing the results of one or more ChIA-PET experiments.

epivizrData Data Management API for epiviz interactive visualization app

Serve data from Bioconductor Objects through a WebSocket connection.

epivizrServer WebSocket server infrastructure for epivizr apps and packages

This package provides objects to manage WebSocket connections to epiviz apps. Other epivizr packages use this infrastructure.

Harman The removal of batch effects from datasets using a PCA and constrained optimisation based technique

Harman is a PCA and constrained optimisation based technique that maximises the removal of batch effects from datasets, with the constraint that the probability of overcorrection (i.e. removing genuine biological signal along with batch noise) is kept to a fraction which is set by the end-user.

MultiDataSet Implementation of the BRGE's (Bioinformatic Research Group in Epidemiology from Center for Research in Environmental Epidemiology) MultiDataSet and MethylationSet

Implementation of the BRGE's (Bioinformatic Research Group in Epidemiology from Center for Research in Environmental Epidemiology) MultiDataSet and MethylationSet. MultiDataSet is designed for integrating multi-omics data sets and MethylationSet to contain normalized methylation data. This package contains base classes for the MEAL and rexposome packages.

PureCN Copy number calling and SNV classification using targeted short read sequencing

This package estimates tumor purity, copy number, loss of heterozygosity (LOH), and status of single nucleotide variants (SNVs). PureCN is designed for targeted short read sequencing data, integrates well with standard somatic variant detection pipelines, and has support for tumor samples without matching normal samples.

ClusterSignificance The ClusterSignificance package provides tools to assess if clusters have a separation different from random or permuted data

The ClusterSignificance package provides tools to assess if clusters have a separation different from random or permuted data. ClusterSignificance investigates clusters of two or more groups by first, projecting all points onto a one dimensional line. Cluster separations are then scored and the probability of the seen separation being due to chance is evaluated using a permutation method.
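
The project, score and permute workflow described above can be sketched as follows. This is not the ClusterSignificance implementation: projecting onto the line through the two group centroids and scoring by midpoint classification accuracy are simplifying assumptions, and all names are hypothetical.

```python
import random

def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def separation_score(group_a, group_b):
    """Project both groups onto the line through their centroids and return
    the fraction of points falling on their own side of the midpoint."""
    ca, cb = centroid(group_a), centroid(group_b)
    direction = [y - x for x, y in zip(ca, cb)]

    def project(p):
        return sum((p[d] - ca[d]) * direction[d] for d in range(len(ca)))

    # The centroid of group_b projects to |direction|^2, so the midpoint is half of that.
    midpoint = sum(v * v for v in direction) / 2.0
    correct = sum(project(p) < midpoint for p in group_a)
    correct += sum(project(p) >= midpoint for p in group_b)
    return correct / (len(group_a) + len(group_b))

def permutation_pvalue(group_a, group_b, n_perm=999, seed=0):
    """Probability of a separation at least as good under shuffled group labels."""
    rng = random.Random(seed)
    observed = separation_score(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # The projection line is recomputed for every permuted grouping.
        if separation_score(pooled[:len(group_a)], pooled[len(group_a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

cluster_1 = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
cluster_2 = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
p_value = permutation_pvalue(cluster_1, cluster_2)
```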

InteractionSet Base Classes for Storing Genomic Interaction Data

Provides the GInteractions, InteractionSet and ContactMatrix objects and associated methods for storing and manipulating genomic interaction data from Hi-C and ChIA-PET experiments.

pbcmc Permutation-Based Confidence for Molecular Classification

The pbcmc package characterizes uncertainty assessment on gene expression classifiers, a.k.a. molecular signatures, based on a permutation test. In order to achieve this goal, synthetic simulated subjects are obtained by permutations of gene labels. Then, each synthetic subject is tested against the corresponding subtype classifier to build the null distribution. Thus, classification confidence measurement can be provided for each subject, to assist physician therapy choice. At present, it is only available for the PAM50 implementation in the genefu package but it can easily be extended to other molecular signatures.

LymphoSeq Analyze high-throughput sequencing of T and B cell receptors

This R package analyzes high-throughput sequencing of T and B cell receptor complementarity determining region 3 (CDR3) sequences generated by Adaptive Biotechnologies' ImmunoSEQ assay. Its input comes from tab-separated value (.tsv) files exported from the ImmunoSEQ analyzer.

genbankr Parsing GenBank files into semantically useful objects

Reads Genbank files.

BgeeDB Annotation and gene expression data from Bgee database

A package for the annotation and gene expression data download from Bgee database, and TopAnat analysis: GO-like enrichment of anatomical terms, mapped to genes by expression patterns.

pqsfinder Identification of potential quadruplex forming sequences

The main functionality of this package is to detect DNA sequence patterns that are likely to fold into an intramolecular G-quadruplex (G4). Unlike many other approaches, this package is able to detect sequences responsible for G4s folded from imperfect G-runs containing bulges or mismatches and as such is more sensitive than competing algorithms.

oppar Outlier profile and pathway analysis in R

The R implementation of mCOPA package published by Wang et al. (2012). Oppar provides methods for Cancer Outlier profile Analysis. Although initially developed to detect outlier genes in cancer studies, methods presented in oppar can be used for outlier profile analysis in general. In addition, tools are provided for gene set enrichment and pathway analysis.

pcaExplorer Interactive Visualization of RNA-seq Data Using a Principal Components Approach

This package provides functionality for interactive visualization of RNA-seq datasets based on Principal Components Analysis. The methods provided allow for quick information extraction and effective data exploration. A Shiny application encapsulates the whole analysis.

BatchQC Batch Effects Quality Control Software

Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data, and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs, and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.

scran Methods for Single-Cell RNA-Seq Data Analysis

Implements a variety of low-level analyses of single-cell RNA-seq data. Methods are provided for normalization of cell-specific biases, assignment of cell cycle phase, and detection of highly variable and significantly correlated genes.

Glimma Interactive HTML graphics

This package generates interactive visualisations for analysis of RNA-sequencing data using output from limma, edgeR or DESeq2 packages in an HTML page. The interactions are built on top of the popular static representations of analysis results in order to provide additional information.

odseq Outlier detection in multiple sequence alignments

Performs outlier detection of sequences in a multiple sequence alignment using bootstrap of predefined distance metrics. Outlier sequences can make downstream analyses unreliable or make the alignments less accurate while they are being constructed. This package implements the OD-seq algorithm proposed by Jehl et al (doi 10.1186/s12859-015-0702-1) for aligned sequences and a variant using string kernels for unaligned sequences.

CONFESS Cell OrderiNg by FluorEScence Signal

Single Cell Fluidigm Spot Detector.

Linnorm Linear model and normality based transformation method (Linnorm)

Linnorm is an R package for the analysis of RNA-seq, scRNA-seq, ChIP-seq count data or any large scale count data. It transforms such datasets for parametric tests. In addition to the transformation function, the following pipelines are implemented: 1. Cell subpopulation analysis and visualization using PCA clustering, 2. Differential expression analysis or differential peak detection using limma, 3. Highly variable gene discovery and visualization, 4. Gene correlation network analysis and visualization. Linnorm can work with raw count, CPM, RPKM, FPKM and TPM. Additionally, Linnorm provides the RnaXSim function for the simulation of RNA-seq raw counts for the evaluation of differential expression analysis methods. RnaXSim can simulate RNA-seq datasets in Gamma, Log Normal, Negative Binomial or Poisson distributions.

BadRegionFinder BadRegionFinder: an R/Bioconductor package for identifying regions with bad coverage

BadRegionFinder is a package for identifying regions with a bad, acceptable and good coverage in sequence alignment data available as bam files. The whole genome may be considered as well as a set of target regions. Various visual and textual types of output are available.

EBSEA Exon Based Strategy for Expression Analysis of genes

Calculates differential expression of genes based on exon counts of genes obtained from RNA-seq sequencing data.

CINdex Chromosome Instability Index

The CINdex package addresses an important area of high-throughput genomic analysis. It allows the automated processing and analysis of the experimental DNA copy number data generated by Affymetrix SNP 6.0 arrays or similar high throughput technologies. It calculates the chromosome instability (CIN) index, which quantitatively characterizes genome-wide DNA copy number alterations as a measure of chromosomal instability. This package calculates not only overall genomic instability, but also instability in terms of copy number gains and losses separately at the chromosome and cytoband level.

QUBIC An R package for qualitative biclustering in support of gene co-expression analyses

The core function of this R package is to provide the implementation of the well-cited and well-reviewed QUBIC algorithm, aiming to deliver an effective and efficient biclustering capability. This package also includes the following related functions: (i) a qualitative representation of the input gene expression data, through a well-designed discretization method that considers the underlying data property, which can be directly used in other biclustering programs; (ii) visualization of identified biclusters using heatmaps in support of overall expression pattern analysis; (iii) bicluster-based co-expression network elucidation and visualization, where different correlation coefficient scores between a pair of genes are provided; and (iv) a generalized output format of biclusters and the corresponding network, which can be freely downloaded so that a user can easily carry out subsequent comprehensive functional enrichment analysis (e.g. DAVID) and advanced network visualization (e.g. Cytoscape).

isomiRs Analyze isomiRs and miRNAs from small RNA-seq

Characterization of miRNAs and isomiRs, clustering and differential expression.

GenoGAM A GAM based framework for analysis of ChIP-Seq data

This package allows statistical analysis of genome-wide data with smooth functions using generalized additive models based on the implementation from the R-package 'mgcv'. It provides methods for the statistical analysis of ChIP-Seq data including inference of protein occupancy, and pointwise and region-wise differential analysis. Estimation of dispersion and smoothing parameters is performed by cross-validation. Scaling of generalized additive model fitting to whole chromosomes is achieved by parallelization over overlapping genomic intervals.

MultiAssayExperiment Create Classes and Functions for Managing Multiple Assays on Sets of Samples

Develop an integrative environment where multiple assays are managed and preprocessed for genomic data analysis.

sscu Strength of Selected Codon Usage

The package calculates selection in codon usage in bacterial species. First and most importantly, it calculates the strength of selected codon usage bias (sscu) based on Paul Sharp's method. The method takes the background mutation rate into account and focuses only on codons with universal translational advantages in all bacterial species, so the sscu index is comparable among different species. In addition, detailed information on optimal (selected) codons can be calculated with the optimal_codons function, giving users a more accurate selective scheme for each codon. The package also provides the optimal_index function. It has a similar mathematical formula to the s index, but estimates the amount of GC-ending optimal codons for the highly expressed genes in the four- and six-codon boxes. The function takes the background mutation rate into account and is comparable with the s index. However, since the set of GC-ending optimal codons is likely to differ among species, this index cannot be compared among different species.

DEFormats Differential gene expression data formats converter

Convert between different data formats used by differential gene expression analysis tools.

genphen A tool for quantification of genotype-phenotype associations using statistical learning techniques

Given a set of genetic polymorphisms in the form of single nucleotide polymorphisms or single amino acid polymorphisms and corresponding phenotype data, we are often interested in quantifying their association so that we can identify the causal polymorphisms. Using statistical learning techniques such as random forests and support vector machines, this tool provides the means to quantify genotype-phenotype associations. It also provides visualization functions that enable the user to visually inspect the results of such a genetic association study and conveniently select the genotypes with the highest strength of association with the phenotype.

recoup An R package for the creation of complex genomic profile plots

recoup calculates and plots signal profiles created from short sequence reads derived from Next Generation Sequencing technologies. The profiles provided are either summarized curve profiles or heatmap profiles. Currently, recoup supports genomic profile plots for reads derived from ChIP-Seq and RNA-Seq experiments. The package uses ggplot2 and ComplexHeatmap graphics facilities for curve and heatmap coverage profiles respectively.

AneuFinder Analysis of Copy Number Variation in Single-Cell-Sequencing Data

This package implements functions for CNV calling, plotting, export and analysis from whole-genome single cell sequencing data.

OncoScore A tool to identify potentially oncogenic genes

OncoScore is a tool to measure the association of genes to cancer based on citation frequency in biomedical literature. The score is evaluated from PubMed literature by dynamically updatable web queries.

CountClust Clustering and Visualizing RNA-Seq Expression Data using Grade of Membership Models

Fits grade of membership models (GoM, also known as admixture models) to cluster RNA-seq gene expression count data, identifies characteristic genes driving cluster memberships, and provides a visual summary of the cluster memberships.

ISoLDE Integrative Statistics of alleLe Dependent Expression

This package provides ISoLDE, a new method for identifying imprinted genes. This method is dedicated to data arising from RNA sequencing technologies. The ISoLDE package implements original statistical methodology described in the publication below.

GMRP GWAS-based Mendelian Randomization and Path Analyses

Perform Mendelian randomization analysis of multiple SNPs to determine risk factors causing the disease under study and to exclude confounding variables, and perform path analysis to construct paths from risk factors to the disease.

debrowser debrowser: Interactive Differential Expression Analysis Browser

Bioinformatics platform containing interactive plots and tables for differential gene and region expression studies. It allows users to visualize expression data in greater depth, interactively and quickly. By changing the parameters, users can easily explore different parts of the data in ways that were not practical before. Creating and inspecting these plots manually takes time; with this system users can prepare plots without writing any code. Differential expression, PCA and clustering analyses are performed on site and the results are shown in various plots such as scatter, bar, box, volcano and MA plots and heatmaps.

DRIMSeq Differential splicing and sQTL analyses with Dirichlet-multinomial model in RNA-Seq

The package provides two frameworks: one for differential splicing analysis between different conditions and one for sQTL analysis. Both are based on modeling the counts of genomic features (i.e., transcripts, exons or exonic bins) with the Dirichlet-multinomial distribution. The package also makes available functions for visualization and exploration of the data and results.
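
As an illustration of the distribution mentioned above, the following is a minimal Python sketch of the Dirichlet-multinomial log-probability for one gene's feature counts. It is not DRIMSeq code or its API; the function name `dirichlet_multinomial_logpmf` and the parameter names are hypothetical and chosen only for this example.

```python
from math import exp, lgamma


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts, one per feature (e.g. transcript).
    alpha:  positive concentration parameters, one per feature.
    Both names are illustrative assumptions, not part of any package API.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("all alpha values must be positive")

    n = sum(counts)
    alpha_sum = sum(alpha)

    # log of the multinomial coefficient n! / (x_1! ... x_k!)
    log_coef = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # log of Gamma(alpha_sum) / Gamma(n + alpha_sum)
    log_norm = lgamma(alpha_sum) - lgamma(n + alpha_sum)
    # log of the product of Gamma(x_j + alpha_j) / Gamma(alpha_j)
    log_terms = sum(lgamma(x + a) - lgamma(a) for x, a in zip(counts, alpha))

    return log_coef + log_norm + log_terms


if __name__ == "__main__":
    # Three transcripts of one gene; counts and alpha are made-up numbers.
    print(exp(dirichlet_multinomial_logpmf([30, 10, 5], [3.0, 1.0, 0.5])))
```

Compared with a plain multinomial, the extra concentration parameters let the feature proportions vary between samples, which is why the distribution suits overdispersed count data.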

SpidermiR SpidermiR: An R/Bioconductor package for integrative network analysis with miRNA data

The aims of SpidermiR are to: i) facilitate open-access retrieval of network data from GeneMania, ii) prepare the data using the appropriate gene nomenclature, iii) integrate miRNA data into a specific network, iv) provide different standard analyses and v) allow the user to visualize the results. In more detail, the package provides multiple methods for querying, preparing and downloading network data (GeneMania), for integrating it with validated and predicted miRNA data (mirWalk, miR2Disease, miRTar, miRandola, Pharmaco-miR, DIANA, Miranda, PicTar and TargetScan), and for applying standard analysis (igraph) and visualization methods (networkD3).

DNAshapeR High-throughput prediction of DNA shape features

DNAshapeR is an R/Bioconductor package for ultra-fast, high-throughput predictions of DNA shape features. The package allows users to predict, visualize and encode DNA shape features for statistical learning.

SwathXtend SWATH extended library generation and statistical data analysis

It contains utility functions for integrating spectral libraries for SWATH and statistical data analysis for SWATH generated data.

bacon Controlling bias and inflation in association studies using the empirical null distribution

Bacon can be used to remove inflation and bias often observed in epigenome- and transcriptome-wide association studies. To this end bacon constructs an empirical null distribution using a Gibbs Sampling algorithm by fitting a three-component normal mixture on z-scores.
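
To make the notions of bias and inflation concrete, here is a minimal Python sketch that estimates the location and scale of an empirical null from z-scores and rescales them. It deliberately swaps in a simple median/MAD estimator for bacon's Gibbs sampler and three-component mixture, so it illustrates the idea only; the function names `empirical_null` and `correct_z_scores` are hypothetical.

```python
import statistics

# Factor that makes the median absolute deviation (MAD) a consistent
# estimate of the standard deviation under a normal distribution.
MAD_TO_SD = 1.4826


def empirical_null(z_scores):
    """Return (bias, inflation) estimated robustly from z-scores.

    bias is the median; inflation is the scaled MAD. This is a simplified
    stand-in for a fitted empirical null, not bacon's actual estimator.
    """
    values = list(z_scores)
    bias = statistics.median(values)
    mad = statistics.median([abs(z - bias) for z in values])
    inflation = MAD_TO_SD * mad
    if inflation == 0:
        raise ValueError("z-scores have zero spread; cannot estimate inflation")
    return bias, inflation


def correct_z_scores(z_scores):
    """Center and rescale z-scores using the estimated empirical null."""
    values = list(z_scores)
    bias, inflation = empirical_null(values)
    return [(z - bias) / inflation for z in values]


if __name__ == "__main__":
    # Made-up z-scores, for illustration only.
    z = [0.4, 1.9, -1.1, 0.3, 2.6, -0.8, 0.2, 1.2]
    print(empirical_null(z))
    print(correct_z_scores(z))
```

Under an ideal null the z-scores would have bias 0 and inflation 1; estimates away from those values indicate how far the observed test statistics depart from the theoretical null.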

PCAN Phenotype Consensus ANalysis (PCAN)

Phenotype comparison based on a pathway consensus approach. Assess the relationship between candidate genes and a set of phenotypes based on additional genes related to the candidate (e.g. pathways or network neighbors).

doppelgangR Identify likely duplicate samples from genomic or meta-data

The main function is doppelgangR(), which takes as minimal input a list of ExpressionSet objects, and searches all list pairs for duplicated samples. The search is based on the genomic data (exprs(eset)), phenotype/clinical data (pData(eset)), and "smoking guns" - supposedly unique identifiers found in pData(eset).
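
The pairwise search over datasets can be sketched in Python as follows. This is a conceptual illustration, not doppelgangR's implementation: it uses only expression values, plain Pearson correlation and a fixed cutoff, all of which are assumptions made for the example, and the names `pearson` and `find_candidate_duplicates` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def find_candidate_duplicates(datasets, threshold=0.99):
    """Compare every sample of every dataset pair; flag high correlations.

    datasets: dict mapping dataset name -> dict of sample id -> expression
              values, with genes in the same order everywhere (assumed).
    threshold: fixed cutoff, an assumption made for this sketch.
    """
    hits = []
    for (name_a, samples_a), (name_b, samples_b) in combinations(datasets.items(), 2):
        for id_a, expr_a in samples_a.items():
            for id_b, expr_b in samples_b.items():
                r = pearson(expr_a, expr_b)
                if r >= threshold:
                    hits.append((name_a, id_a, name_b, id_b, r))
    return hits


if __name__ == "__main__":
    # Made-up expression values for two small datasets.
    toy = {
        "A": {"a1": [1, 2, 3, 4], "a2": [4, 1, 3, 2]},
        "B": {"b1": [1.1, 2.0, 3.1, 3.9], "b2": [2, 4, 1, 3]},
    }
    for hit in find_candidate_duplicates(toy):
        print(hit)
```

In this toy input only the pair a1/b1 is flagged, since its profiles are nearly identical across the two datasets.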
