DOI: 10.18129/B9.bioc.BUSpaRse    

kallisto | bustools R utilities

Bioconductor version: Release (3.12)

The kallisto | bustools pipeline is a fast and modular set of tools that converts single-cell RNA-seq reads in fastq files into gene count or transcript compatibility count (TCC) matrices for downstream analysis. Central to this pipeline is the barcode, UMI, and set (BUS) file format. This package serves three purposes. First, it lets users manipulate BUS format files as data frames in R and then convert them into gene count or TCC matrices. Since R and Rcpp code is easier to handle than pure C++ code, users are encouraged to tweak the source code of this package to experiment with new uses of the BUS format and with different ways of converting a BUS file into a gene count matrix. Second, it conveniently generates the files required to produce gene count matrices for spliced and unspliced transcripts for RNA velocity. At this step, biotypes can be filtered, scaffolds and haplotypes can be removed, and the filtered transcriptome can be extracted and written to disk. Third, it implements utility functions to get the transcripts and associated genes required to convert BUS files to gene count matrices, to write the transcript-to-gene information in the format required by bustools, and to read the output of bustools into R as sparse matrices.
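To make the first purpose concrete, here is a minimal sketch of how BUS-like records held in a data frame can be collapsed into a sparse gene count matrix. It does not use BUSpaRse's own functions: the toy data, the column names (`barcode`, `umi`, `gene`), and the assumption that each record has already been mapped from its equivalence class to a single gene are all illustrative. It relies only on the Matrix package, which ships with standard R distributions.

```r
library(Matrix)

# Toy BUS-like records: one row per observed (barcode, UMI, gene) read.
# All values are made up; a real BUS file stores an equivalence class
# (a set of transcripts) per record, and the mapping to genes is skipped here.
bus <- data.frame(
  barcode = c("AAAC", "AAAC", "AAAC", "AAAG", "AAAG", "AAAG"),
  umi     = c("TTGA", "TTGA", "CCAT", "GGTA", "GGTC", "ACGT"),
  gene    = c("geneA", "geneA", "geneB", "geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Collapse duplicate reads: the same barcode, UMI, and gene is one molecule.
molecules <- unique(bus)

# Genes become rows and barcodes become columns.
genes    <- sort(unique(molecules$gene))
barcodes <- sort(unique(molecules$barcode))

# sparseMatrix() adds up x over repeated (i, j) pairs, so each entry
# ends up holding the number of distinct UMIs for that gene and barcode.
counts <- sparseMatrix(
  i = match(molecules$gene, genes),
  j = match(molecules$barcode, barcodes),
  x = rep(1, nrow(molecules)),
  dims = c(length(genes), length(barcodes)),
  dimnames = list(genes, barcodes)
)

print(counts)
```

Storing the result as a sparse matrix matters because most gene and barcode combinations in single-cell data have no counts at all, so only the nonzero entries need to be kept in memory.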

Author: Lambda Moses [aut, cre], Lior Pachter [aut, ths]

Maintainer: Lambda Moses <dlu2 at caltech.edu>

Citation (from within R, enter citation("BUSpaRse")):


To install this package, start R (version "4.0") and enter:

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("BUSpaRse")


For older versions of R, please refer to the appropriate Bioconductor release.


To view documentation for the version of this package installed on your system, start R and enter:

browseVignettes("BUSpaRse")



HTML R Script Converting BUS format into sparse matrix
HTML R Script Transcript to gene
PDF   Reference Manual
Text   NEWS


biocViews RNASeq, SingleCell, Software, WorkflowStep
Version 1.3.1
In Bioconductor since BioC 3.10 (R-3.6) (1 year)
License BSD_2_clause + file LICENSE
Depends R (>= 3.6)
Imports AnnotationDbi, AnnotationFilter, biomaRt, BiocGenerics, Biostrings, BSgenome, dplyr, ensembldb, GenomeInfoDb, GenomicFeatures, GenomicRanges, ggplot2, IRanges, magrittr, Matrix, methods, plyranges, Rcpp, S4Vectors, stats, stringr, tibble, tidyr, utils, zeallot
LinkingTo Rcpp, RcppArmadillo, RcppProgress, BH
Suggests knitr, rmarkdown, testthat, BiocStyle, TENxBUSData, TxDb.Hsapiens.UCSC.hg38.knownGene, BSgenome.Hsapiens.UCSC.hg38, EnsDb.Hsapiens.v86
SystemRequirements GNU make
URL https://github.com/BUStools/BUSpaRse
BugReports https://github.com/BUStools/BUSpaRse/issues
Depends On Me
Imports Me
Suggests Me
Links To Me
Build Report  

Package Archives

Follow Installation instructions to use this package in your R session.

Source Package BUSpaRse_1.3.1.tar.gz
Windows Binary BUSpaRse_1.4.0.zip (32- & 64-bit)
macOS 10.13 (High Sierra) BUSpaRse_1.4.0.tgz
Source Repository git clone https://git.bioconductor.org/packages/BUSpaRse
Source Repository (Developer Access) git clone git@git.bioconductor.org:packages/BUSpaRse
Package Short Url https://bioconductor.org/packages/BUSpaRse/
Package Downloads Report Download Stats
