1 Introduction

Expression quantitative trait loci (eQTL) analysis links variations in gene expression levels to genotypes. scQTLtools is designed to identify genetic variants that influence gene expression at the single-cell level, and can also visualize the results. Our package includes data preprocessing and multiple visualization options, providing researchers with a powerful tool for exploring sc-eQTLs within specific cellular contexts.

1.1 Rationale for Bioconductor Submission

By seeking inclusion in Bioconductor, we aim to integrate scQTLtools into a well-established ecosystem that is widely used by researchers in bioinformatics. Bioconductor’s rigorous standards for package quality and its focus on reproducibility will enhance the credibility of scQTLtools. Additionally, being part of Bioconductor will provide access to a broader user base and foster collaboration with other developers, contributing to the ongoing improvement and validation of the package.

2 Installation

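# Install BiocManager only if it is not already available, then install scQTLtools.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("scQTLtools")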

3 Overview of the package

The functions in scQTLtools fall into four modules: data input, data preprocessing, sc-eQTL calling, and visualization. The functions and their brief descriptions are summarized below.

3.1 General Workflow

Each module is summarized below.

scQTLtools requires two key inputs: a single-cell gene expression dataset and a corresponding SNP genotype matrix. The single-cell gene expression dataset can be either a gene expression matrix or an object such as a Seurat v4 object or a Bioconductor SingleCellExperiment object. The input genotype matrix should follow a 0/1/2/3 encoding scheme: 1 for the homozygous reference genotype, 2 for the homozygous alternative genotype, 3 for the heterozygous genotype, and 0 for missing values. scQTLtools also supports a simplified 0/1/2 encoding scheme, in which 2 denotes a non-reference genotype. In addition, the package can normalize the raw single-cell gene expression matrix and filter SNP–gene pairs. After preprocessing, scQTLtools uses the callQTL() function to identify sc-eQTLs. Finally, visualization at the single-cell level shows the specificity of eQTLs across distinct cell types or cellular states.
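The genotype encoding described above can be illustrated with a small base-R sketch. It does not call scQTLtools. The toy matrices, the SNP-by-cell orientation, and the helpers is_valid_encoding() and collapse_to_simple() are hypothetical and used only for illustration. Mapping heterozygous calls (3) to the non-reference code (2) is one plausible reading of the simplified scheme, not a documented package behaviour.

```r
# Toy SNP-by-cell genotype matrix in the 0/1/2/3 encoding:
# 0 = missing, 1 = homozygous reference,
# 2 = homozygous alternative, 3 = heterozygous.
# All names and values are made up for illustration.
set.seed(1)
snps  <- paste0("SNP", 1:4)
cells <- paste0("cell", 1:6)
geno <- matrix(
    c(1, 2, 3, 0, 1, 3,
      2, 2, 1, 1, 0, 3,
      3, 3, 3, 1, 2, 0,
      1, 1, 2, 2, 3, 3),
    nrow = 4, byrow = TRUE,
    dimnames = list(snps, cells)
)

# Hypothetical helper: TRUE if the matrix uses only the allowed codes.
is_valid_encoding <- function(m, allowed = 0:3) {
    is.numeric(m) && !anyNA(m) && all(m %in% allowed)
}

# Hypothetical helper: collapse 0/1/2/3 to the simplified 0/1/2 scheme
# by treating heterozygous (3) as non-reference (2). This mapping is an
# assumption, not taken from the package documentation.
collapse_to_simple <- function(m) {
    stopifnot(is_valid_encoding(m))
    m[m == 3] <- 2
    m
}

geno_simple <- collapse_to_simple(geno)

# Toy gene-by-cell count matrix for the same cells.
expr <- matrix(
    rpois(3 * length(cells), lambda = 5),
    nrow = 3,
    dimnames = list(paste0("Gene", 1:3), cells)
)

# Both inputs should describe the same cells in the same order.
stopifnot(identical(colnames(geno), colnames(expr)))
```

Checking the encoding and the cell order before any analysis is a cheap way to catch mismatched inputs early. Consult the package's function documentation for the exact input layout it expects.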

3.2 Comparison with and advantages over similar tools

We compared scQTLtools to other packages with similar functionality, including eQTLsingle, SCeQTL, MatrixEQTL, and iBMQ, as shown in the table below.