CSAMA 2017: Statistical Data Analysis for Genome-Scale Biology
June 11-16, 2017
Bressanone-Brixen, Italy
URL: http://www.huber.embl.de/csama2017/
Lecturers: Jennifer Bryan, RStudio and UBC; Vincent J. Carey, Harvard
Medical School; Laurent Gatto, University of Cambridge; Wolfgang
Huber, European Molecular Biology Laboratory (EMBL), Heidelberg;
Martin Morgan, Roswell Park Cancer Institute, Buffalo; Johannes
Rainer, European Academy of Bozen (EURAC); Charlotte Soneson,
University of Zurich; Levi Waldron, CUNY School of Public Health at
Hunter College, New York.
Teaching Assistants: Simone Bell, EMBL, Heidelberg; Vladislav Kim,
EMBL, Heidelberg; Lori Shepherd, RPCI, Buffalo; Mike L. Smith, EMBL,
Heidelberg.
Resources
Source: GitHub
Monday, June 12
Lectures
- Introduction to R and Bioconductor
(html).
- Computing with Sequences and Ranges
(html).
- Tabular data management
(pdf).
- Annotation resources
(html);
EnsemblDb (pdf).
Labs
Tuesday, June 13
Lectures
- Basics of sequence alignment and aligners
(pdf).
- RNA-Seq data analysis and differential expression
(pdf).
- New workflows for RNA-seq
(pdf).
- Hypothesis testing
(pdf).
Labs
- End-to-end RNA-Seq workflow
(html).
- Independent hypothesis weighting
(html).
Wednesday, June 14
Lectures
- Multiple testing
(txt).
- Linear models (basic intro)
(html).
- Experimental design, batch effects and confounding
(pdf).
- Robust statistics: median, MAD, rank test, Spearman, robust linear model
(html).
Thursday, June 15
Lectures
- Visualization, the grammar of graphics and ggplot2
(pdf).
- Mass spec proteomics
(html)
and metabolomics
(pdf).
- Clustering and classification
(html).
- Resampling: cross-validation, bootstrap, and permutation tests
(html).
- Analysis of microbiome marker gene data
(pdf).
Labs
- Mass spec proteomics & metabolomics:
Proteomics (html),
Metabolomics (html).
- MultiAssayExperiment
(html),
cheatsheet
(pdf).
Friday, June 16
Lectures
- Gene set enrichment analysis
(pdf).
- Working with large-scale
(pdf)
and remote
(html)
data.
- Developer Practices, Writing functions
(html).
- Developer Practices, Writing packages
(html).
Labs
- Graphics (pdf)
(data: diabetes.csv).
- Machine learning (supervised)
(pdf).
- Large data, performance, and parallelization; large-scale efficient
computation with genomic intervals
(html).