CSAMA 2017: Statistical Data Analysis for Genome-Scale Biology

June 11-16, 2017
Bressanone-Brixen, Italy
URL: http://www.huber.embl.de/csama2017/

Lecturers: Jennifer Bryan, RStudio and UBC; Vincent J. Carey, Harvard Medical School; Laurent Gatto, University of Cambridge; Wolfgang Huber, European Molecular Biology Laboratory (EMBL), Heidelberg; Martin Morgan, Roswell Park Cancer Institute, Buffalo; Johannes Rainer, European Academy of Bozen (EURAC); Charlotte Soneson, University of Zurich; Levi Waldron, CUNY School of Public Health at Hunter College, New York.

Teaching Assistants: Simone Bell, EMBL, Heidelberg; Vladislav Kim, EMBL, Heidelberg; Lori Shepherd, RPCI, Buffalo; Mike L. Smith, EMBL, Heidelberg.

Resources

Source: GitHub

Monday, June 12

Lectures

  • Introduction to R and Bioconductor (html).
  • Computing with Sequences and Ranges (html).
  • Tabular data management (pdf).
  • Annotation resources (html); EnsemblDb (pdf).

Labs

Tuesday, June 13

Lectures

  • Basics of sequence alignment and aligners (pdf).
  • RNA-Seq data analysis and differential expression (pdf).
  • New workflows for RNA-seq (pdf).
  • Hypothesis testing (pdf).

Labs

  • End-to-end RNA-Seq workflow (html).
  • Independent hypothesis weighting (html).

Wednesday, June 14

Lectures

  • Multiple testing (txt).
  • Linear models (basic intro) (html).
  • Experimental design, batch effects and confounding (pdf).
  • Robust statistics: median, MAD, rank test, Spearman, robust linear model (html).

Thursday, June 15

Lectures

  • Visualization, the grammar of graphics and ggplot2 (pdf).
  • Mass spec proteomics (html) and metabolomics (pdf).
  • Clustering and classification (html).
  • Resampling: cross-validation, bootstrap, and permutation tests (html).
  • Analysis of microbiome marker gene data (pdf).

Labs

  • Mass spec proteomics & metabolomics: Proteomics (html), Metabolomics (html).
  • MultiAssayExperiment (html), cheatsheet (pdf).

Friday, June 16

Lectures

  • Gene set enrichment analysis (pdf).
  • Working with large-scale (pdf) and remote (html) data.
  • Developer Practices, Writing functions (html).
  • Developer Practices, Writing packages (html).

Labs

  • Graphics (pdf) (data: diabetes.csv).
  • Machine learning (supervised) (pdf).
  • Large data, performance, and parallelization; large-scale efficient computation with genomic intervals (html).