COTAN 2.1.8
library(COTAN)
library(zeallot)
library(data.table)
library(factoextra)
library(Rtsne)
library(qpdf)
library(GEOquery)
options(parallelly.fork.enable = TRUE)
This tutorial covers the same functionality as the tutorial for the first COTAN release, but uses the new and updated functions.
Download the data-set for "mouse cortex E17.5".
dataDir <- tempdir()
GEO <- "GSM2861514"
fName <- "GSM2861514_E175_Only_Cortical_Cells_DGE.txt.gz"
dataSetFile <- file.path(dataDir, GEO, fName)
if (!file.exists(dataSetFile)) {
  getGEOSuppFiles(GEO, makeDirectory = TRUE,
                  baseDir = dataDir, fetch_files = TRUE,
                  filter_regex = fName)
}
# Read the table outside the `if`, so it is also loaded when the file is already cached
sample.dataset <- read.csv(dataSetFile, sep = "\t", row.names = 1L)
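A quick sanity check of the loaded table can catch parsing or orientation problems early. The sketch below uses only base R and a tiny made-up table (the gene and cell names are illustrative, not taken from the GEO file). It assumes the same layout as the `read.csv()` call above: tab-separated, gene names in the first column, one column per cell.

```r
# Toy stand-in for a digital gene expression (DGE) table:
# tab-separated, first column holds gene names, remaining columns are cells.
toyFile <- tempfile(fileext = ".txt.gz")
con <- gzfile(toyFile, "w")
writeLines(c("GENE\tcellA\tcellB\tcellC",
             "Gene1\t0\t3\t1",
             "Gene2\t5\t0\t2"), con)
close(con)

# read.csv() reads gzip-compressed files transparently
toy <- read.csv(toyFile, sep = "\t", row.names = 1L)

dim(toy)                                    # number of genes, number of cells
all(vapply(toy, is.numeric, logical(1L)))   # TRUE when every column parsed as numbers
anyDuplicated(rownames(toy))                # 0 when gene names are unique
```

The same three checks can be run on `sample.dataset` before it is passed to the `COTAN()` constructor.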
Define a directory where the output will be stored.
outDir <- tempdir()
# Log-level 2 was chosen to showcase better how the package works
# In normal usage a level of 0 or 1 is more appropriate
setLoggingLevel(2L)
#> Setting new log level to 2
# This file will contain all the logs produced by the package
# as if at the highest logging level
setLoggingFile(file.path(outDir, "vignette_v2.log"))
#> Setting log file to be: /tmp/RtmpSMV3wE/vignette_v2.log
Initialize the COTAN object with the raw count table and the metadata for the experiment.
cond <- "mouse_cortex_E17.5"
#cond <- "test"
#obj = COTAN(raw = sampled.dataset)
obj <- COTAN(raw = sample.dataset)
obj <- initializeMetaDataset(obj,
                             GEO = GEO,
                             sequencingMethod = "Drop_seq",
                             sampleCondition = cond)
#> Initializing `COTAN` meta-data
logThis(paste0("Condition ", getMetadataElement(obj, datasetTags()[["cond"]])),
        logLevel = 1L)
#> Condition mouse_cortex_E17.5
Before we proceed to the analysis, we need to clean the data. The analysis takes a matrix of raw UMI counts as input. To obtain this matrix, we have to remove any potential cell doublets or multiplets, as well as any low-quality or dying cells.
We can check the library size (UMI number) with an empirical cumulative distribution function:
ECDPlot(obj, yCut = 700L)
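To make the idea behind the plot concrete, here is a minimal base R sketch of an empirical cumulative distribution of library sizes. It is not how `ECDPlot()` is implemented; the count matrix and the cut-off value are made up for illustration.

```r
# Toy raw-count matrix: 4 genes (rows) x 5 cells (columns); values are made up
counts <- matrix(c(10, 0, 5, 1,
                   200, 50, 30, 20,
                   0, 1, 0, 2,
                   400, 300, 200, 100,
                   3, 3, 3, 3),
                 nrow = 4L,
                 dimnames = list(paste0("Gene", 1:4), paste0("cell", 1:5)))

# Library size = total UMI count per cell (column sums)
libSize <- colSums(counts)

# Empirical cumulative distribution function of the library sizes
libEcdf <- ecdf(libSize)

# Fraction of cells whose library size is at or below a hypothetical cut-off
cutOff <- 12
libEcdf(cutOff)
```

Cells in the lower tail of this distribution have very few UMIs, which makes them candidates for removal as low-quality cells.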
cellSizePlot(obj)
#> Warning: Removed 1 rows containing missing values (`geom_point()`).
genesSizePlot(obj)
mit <- mitochondrialPercentagePlot(obj, genePrefix = "^Mt")
mit[["plot"]]
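The quantity shown in the mitochondrial plot can be illustrated with a small base R sketch: for each cell, the share of counts that comes from genes whose name matches a prefix pattern. This is an illustration under stated assumptions, not the package's own code; the gene names and counts are invented, and the `"^Mt"` pattern simply mirrors the `genePrefix` argument used above.

```r
# Toy raw-count matrix: 3 genes (rows) x 3 cells (columns); values are made up
counts <- matrix(c(8, 2, 90,
                   1, 1, 98,
                   30, 20, 50),
                 nrow = 3L,
                 dimnames = list(c("Mt-Co1", "Mt-Nd1", "Actb"),
                                 c("cellA", "cellB", "cellC")))

# Flag genes whose name starts with the chosen prefix
isMito <- grepl("^Mt", rownames(counts))

# Percentage of each cell's counts that falls on the flagged genes
mitoPerc <- 100 * colSums(counts[isMito, , drop = FALSE]) / colSums(counts)
mitoPerc
```

A cell with an unusually high percentage, such as `cellC` in this toy example, would be a candidate dying cell.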