Mentored Projects

A mentored Bioconductor software development project is one in which experienced programmers work with volunteers to develop new capabilities needed by the community.

Developers new to Bioconductor may find mentored projects a useful way to apply, refine and extend their skills. Projects are identified by experienced Bioconductor developers. The projects involve important but manageable programming tasks. Experienced developers act as mentors, providing guidance and oversight. Successful mentored projects will be incorporated into the appropriate packages, and contributors will receive full credit for their work. Users, contributors and mentors will all benefit.

We anticipate that mentored projects will usually be run by one or two experienced Bioconductor-savvy programmers who provide guidance, usually remotely, to one or more less-experienced programmers. All the tools of ‘social coding’ – from email and svn to GitHub and Skype – can be used, at the discretion of the participants. Except in unusual circumstances, we expect that participants will have their own independent funding, most likely as the result of a good fit between the mentored project and their current employment or academic studies.

Below you will find a list of proposed projects. We invite your participation. We welcome your suggestions.

Extending mzR

The mzR R/Bioconductor package provides a unified API to the common open, community-driven file formats and parsers available for mass spectrometry data, namely mzXML, mzML and mzData (see the package vignette for details). It uses C and C++ code from third-party open-source projects and relies heavily on the Rcpp package, notably to provide a direct mapping from R to the C++ infrastructure. Currently, mzR provides two backends for reading raw mass spectrometry data:

  1. netCDF, which, as the name implies, reads netCDF data
  2. RAMP, which reads mzData and mzXML via the ISB RAMP parser. This backend can also read mzML through the proteowizard RAMPadapter around the proteowizard infrastructure, but this interface is limited to the lowest common denominator of the mzXML, mzData and mzML formats.

This project is intended to add several related backends to mzR, by providing a direct wrapper around – and full access to – the proteowizard msdata object. The candidate will interact closely with Laurent Gatto and Steffen Neumann, and the proteowizard and Rcpp communities.

Project attributes and estimates:


Expanding MotifDb

The MotifDb R/Bioconductor package provides unified access to (currently) seven transcription factor binding site motif collections, covering 22 organisms. We wish to expand its holdings by adding JASPAR 2014 and HOCOMOCO, and improving the annotation we offer for stamlab.

Project attributes and estimates:


Galaxy-ification of Useful Scripts

From the Wikipedia entry for Galaxy:

Galaxy is a scientific workflow, data integration, and data and analysis persistence and publishing platform that aims to make computational biology accessible to research scientists that do not have computer programming experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system.

The new Bioconductor package RGalaxy simplifies the process of exposing an R function in Galaxy, so that a user can run the function using nothing more than a web browser.

This project would involve taking an existing workflow (or conceiving a new workflow) and exposing it in Galaxy.

Project attributes and estimates:


Add DEXSeq Functionality to easyRNASeq

The easyRNASeq package facilitates and expedites the processing and filtering of large RNA-seq datasets for subsequent analysis by the Bioconductor packages edgeR and DESeq, both of which test for differential gene expression. We propose to add an output format compatible with DEXSeq, a package for exon-level differential expression analysis.

Project attributes and estimates:


Get Help With Your Package

Package authors sometimes have excellent statistical and bioinformatic ideas, but are not fully confident in their ability to produce a robust software package suitable for inclusion in Bioconductor. This mentored project pairs the package developer with an experienced programmer to produce quality software. Participants are expected to have a working version of their package, with the major ideas and preliminary implementation complete.

Project attributes and estimates:


Take Over an ‘Orphaned’ Package


Sometimes the maintainer of an older Bioconductor package is no longer able to perform that job. These older packages remain useful but occasionally need a bug fix or a small change. We are looking for volunteers to maintain such packages, which would otherwise be abandoned. Relatively little work is required, the original author will be available to answer questions, the Bioconductor core team can help, and the Bioconductor community will benefit.

Current orphans are listed below.

(No orphans at this time)


Please send mail to pshannon AT fhcrc DOT org if you would like to help out on any of these projects, or have an idea of your own which you wish to propose.


Completed Projects

Create an AnnotationDbi Package for PANTHER

We would like to see PANTHER annotation contained in a Bioconductor AnnotationDbi package.

PANTHER is summarized as follows:

The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence.

Project attributes and estimates:


Add Constructors for Key Classes in the graph Package

The graph package was developed when users created objects with calls like new("graphNEL"), but there are advantages to hiding this level of implementation from the user and instead creating a new instance with graphNEL(). This project modernizes this aspect of the graph package.


VCF Allele Frequencies

The VariantAnnotation package needed a function to compute genotype counts, allele frequencies and Hardy-Weinberg estimates from the genotype data in a VCF class.
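As a rough illustration of the quantities involved, the sketch below computes genotype counts, allele frequencies and a Hardy-Weinberg chi-square statistic for a single biallelic variant. It is written in Python for brevity and is not the VariantAnnotation implementation; the function name `snp_summary` and the use of VCF-style GT strings as input are assumptions made for this example.

```python
from collections import Counter


def snp_summary(genotypes):
    """Summarise one biallelic variant from VCF-style GT strings.

    Hypothetical helper for illustration only; not part of VariantAnnotation.
    Missing or non-biallelic calls (e.g. "./." or "1/2") are skipped.
    """
    counts = Counter()
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")  # treat phased as unphased
        if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
            continue
        counts[sum(int(a) for a in alleles)] += 1  # key = number of ALT alleles

    n_rr, n_ra, n_aa = counts[0], counts[1], counts[2]
    n = n_rr + n_ra + n_aa
    if n == 0:
        raise ValueError("no called genotypes")

    p = (2 * n_rr + n_ra) / (2 * n)  # REF allele frequency
    q = 1 - p                        # ALT allele frequency

    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_rr, n_ra, n_aa)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    return {
        "counts": observed,
        "ref_freq": p,
        "alt_freq": q,
        "hwe_chisq": chisq,
    }
```

A call such as `snp_summary(["0/0", "0/1", "1|0", "1/1", "./."])` returns the counts for the four called genotypes and ignores the missing one. Turning the chi-square statistic into a p-value (one degree of freedom for a biallelic site) is left out to keep the sketch free of dependencies.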

Project attributes and estimates:


VCF Probability-Based SNP Encoding

MatrixToSnpMatrix() in the VariantAnnotation package converts the genotype data in a VCF object into a SnpMatrix object. Currently this is done without taking uncertain genotype calls into consideration. This project involves modifying MatrixToSnpMatrix() to use, when available, genotype uncertainty and likelihood information to convert genotypes to probability-based SnpMatrix encodings.
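To make the idea of a probability-based genotype concrete, the sketch below turns phred-scaled genotype likelihoods (the PL field defined by the VCF format) into normalised genotype probabilities and an expected ALT-allele dosage. It is a Python illustration under stated assumptions: a biallelic site, a flat prior over the three genotypes, and hypothetical function names. It does not reproduce the SnpMatrix byte encoding or the code in VariantAnnotation.

```python
def pl_to_probabilities(pl):
    """Convert phred-scaled likelihoods for (REF/REF, REF/ALT, ALT/ALT)
    into probabilities that sum to 1.

    Hypothetical helper; assumes a biallelic site and a flat genotype prior,
    so the normalised likelihoods are used directly as probabilities.
    """
    if len(pl) != 3:
        raise ValueError("expected three PL values for a biallelic site")
    likelihoods = [10 ** (-value / 10) for value in pl]  # PL = -10 * log10(L)
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]


def expected_alt_dosage(probs):
    """Expected number of ALT alleles (0 to 2) given genotype probabilities."""
    return probs[1] + 2 * probs[2]
```

With PL values of `[0, 10, 20]`, most of the probability falls on the REF/REF genotype and the expected dosage is small but non-zero, which is the information a hard genotype call discards.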

Project attributes and estimates:


msGUI - an interactive mass spectrometry data browser

The aim of this project is to build a simple GUI for navigating raw mass spectrometry data files. Data input functionality and relevant data structures are available in the mzR and MSnbase packages. The final deliverable would be a new R package, to be submitted to Bioconductor, that implements the GUI and allows users to browse raw data files as well as MSnExp raw data instances directly. The overall goal is to complement programmatic data access with interactive visualisation.

Project attributes and estimates:


Fred Hutchinson Cancer Research Center