BiocIntro 0.0.8
Abstract: This two-hour workshop is meant to empower Cancer Moonshot research labs to tackle their bioinformatic analysis challenges. For the first hour and a half, we’ll work through a hands-on workflow for single-cell assay analysis. We’ll introduce data import, management, and interactive visualization using Bioconductor tools like iSEE. After seeing how to work with one assay, we’ll briefly explore approaches to integrating different assays. In the final half hour we’ll go beyond Bioconductor tools for single-cell analysis. We’ll assemble a panel to discuss possible strategies for the data analysis challenges submitted to the organizers before the workshop.
Goal: Empower Cancer Moonshot Research Labs to tackle their bioinformatic analysis challenges
Objectives, this workshop:
What we will learn
R
Vectors, variables, and functions
x = rnorm(100)
mean(x)
## [1] -0.1471021
var(x)
## [1] 0.7680572
hist(x)
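Because rnorm() draws random numbers, the values printed above change on every run. The following is a minimal sketch, not part of the original workshop material, of how set.seed() makes a session repeatable and how a user-defined function is written. The helper name sem and the seed value 123 are illustrative assumptions.

```r
set.seed(123)          # assumed seed; any fixed integer makes the draws repeatable
x = rnorm(100)         # a numeric vector of 100 standard-normal draws

## hypothetical helper: standard error of the mean of a numeric vector
sem = function(v) {
    sd(v) / sqrt(length(v))
}

mean(x)                # a function applied to a variable, as in the text
sem(x)                 # our own function is called the same way
```

Re-running the whole block, including set.seed(), reproduces the same values of mean(x) and sem(x).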
Managing data: classes and methods
y = x + rnorm(100)
df = data.frame(x, y)
plot(y ~ x, df)
Visualization
fit = lm(y ~ x, df)
anova(fit)
## Analysis of Variance Table
##
## Response: y
##           Df  Sum Sq Mean Sq F value    Pr(>F)
## x          1  69.876  69.876  59.719 9.617e-12 ***
## Residuals 98 114.668   1.170
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(y ~ x, df)
abline(fit)
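The objects created above, df and fit, each belong to a class, and functions such as anova() and plot() behave as methods that adapt to that class. A small sketch of how one might inspect this follows. The seed is an assumption added for repeatability, and the data are regenerated so that the block runs on its own.

```r
set.seed(123)                 # assumed seed, not part of the original example
x = rnorm(100)
y = x + rnorm(100)
df = data.frame(x, y)
fit = lm(y ~ x, df)

class(df)                     # "data.frame"
class(fit)                    # "lm"

dim(df)                       # rows and columns of the data.frame
head(df, 3)                   # first three rows
coef(fit)                     # estimated intercept and slope

methods(class = "lm")         # functions with behavior specific to "lm" objects
```

The output of methods(class = "lm") depends on which packages are attached, so it is not reproduced here.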
Extending base R: packages
library(ggplot2)
ggplot(df, aes(x, y)) +
geom_point() +
geom_smooth(method="lm")
CRAN (Comprehensive R Archive Network)
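A package must be installed once before library() can load it. The sketch below shows one common pattern for installing a CRAN package only when it is missing. The repository URL is an assumed choice of CRAN mirror, and the variable name pkg is illustrative.

```r
pkg = "ggplot2"

## install from CRAN only if the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = "https://cloud.r-project.org")  # assumed mirror
}

library(ggplot2)              # attach the package for this session
packageVersion(pkg)           # report the installed version
```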
Help!
?lm
browseVignettes("ggplot2") / vignette(package="ggplot2")
What we learned
Classes such as data.frame help manage data
Help is available through ?, browseVignettes(), and other means
What we will learn
SummarizedExperiment for data management
Bioconductor
Resources
Data management
Domain-specific workflows, e.g., bulk RNA-seq differential expression
library(airway)
data(airway) # load example data
airway
## class: RangedSummarizedExperiment
## dim: 64102 8
## metadata(1): ''
## assays(1): counts
## rownames(64102): ENSG00000000003 ENSG00000000005 ... LRG_98 LRG_99
## rowData names(0):
## colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
## colData names(9): SampleName cell ... Sample BioSample
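The summary printed above names the main parts of the object: assays, row data, and column data. The following hedged sketch shows how these parts might be accessed and how the object can be subset while keeping counts and sample annotation in sync. It assumes the SummarizedExperiment and airway packages are installed, and it uses the dex (treatment) column of the sample annotation, which is elided from the truncated colData listing above.

```r
library(SummarizedExperiment)
library(airway)
data(airway)                          # load example data

dim(airway)                           # features x samples
head(assay(airway, "counts"), 3)      # first rows of the count matrix
colData(airway)[, c("cell", "dex")]   # sample annotation: cell line, treatment

## subsetting columns subsets the counts and the sample annotation together
treated = airway[, airway$dex == "trt"]
dim(treated)

colSums(assay(airway))                # total counts (library size) per sample
```

Keeping counts and annotation in a single object means a subset such as treated cannot drift out of step with its sample descriptions.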