
Overview
A major challenge for droplet-based single-cell sequencing technologies is distinguishing true cells from uninformative barcodes in datasets with disparate library sizes confounded by high technical noise (i.e. batch-specific ambient RNA). dropkick is a fully automated software tool for quality control and filtering of scRNA-seq data with a focus on excluding ambient barcodes and recovering real cells bordering the quality threshold.
Read our Genome Research paper for more information on the development and benchmarking of dropkick, and visit the GitHub repository to view the source code.
dropkick QC
dropkick provides a QC module for initial evaluation of global distributions that define barcode populations (real cells vs. empty droplets) and quantifies the batch-specific ambient gene profile.
You can run the dropkick.qc module from the terminal for a quick look at the total UMI distribution and ambient gene profile. The report is saved as *_qc.png:
dropkick qc path/to/counts.h5ad
The resulting quality control report plots a log-log curve of total counts and genes per ranked barcode, along with the percentage of mitochondrial and ambient counts for each barcode (A). Ambient genes are identified by ranking genes by dropout rate in ascending order and taking the top-ranked transcripts, i.e. those detected in the largest fraction of barcodes (B).
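The dropout-rate ranking described above can be illustrated with a small pure-Python sketch. This is not dropkick's implementation: the function names (dropout_rates, ambient_genes), the n_ambient parameter, the gene labels, and the toy count matrix are all hypothetical and chosen only to show the idea of sorting genes by the fraction of barcodes in which they have zero counts.

```python
def dropout_rates(counts):
    """Return, for each gene (column), the fraction of barcodes (rows) with zero counts."""
    n_barcodes = len(counts)
    n_genes = len(counts[0])
    return [
        sum(1 for row in counts if row[g] == 0) / n_barcodes
        for g in range(n_genes)
    ]


def ambient_genes(counts, gene_names, n_ambient=2):
    """Rank genes by ascending dropout rate and keep the first n_ambient (hypothetical helper)."""
    rates = dropout_rates(counts)
    order = sorted(range(len(gene_names)), key=lambda g: rates[g])
    return [(gene_names[g], rates[g]) for g in order[:n_ambient]]


# Toy barcode-by-gene count matrix (rows: barcodes, columns: genes); values are made up.
counts = [
    [5, 0, 1, 0],
    [3, 0, 2, 1],
    [8, 2, 1, 0],
    [1, 0, 0, 0],
    [2, 0, 3, 0],
]
genes = ["geneA", "geneB", "geneC", "geneD"]  # placeholder gene labels

top = ambient_genes(counts, genes, n_ambient=2)
print(top)
```

In this toy matrix, geneA is detected in every barcode and geneC in all but one, so they rank first. On real data the same ranking would be applied to the full counts matrix, and the number of genes kept would be a user choice.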

dropkick filtering
The primary dropkick module performs cell identification through a weakly supervised logistic regression model. dropkick first establishes initial thresholds on predictive global heuristics using an automated gradient-descent method, then trains a gene-based model that assigns a confidence score to every barcode in the dataset. The resulting model coefficients are sparse and biologically informative, identifying a minimal number of gene features associated with empty droplets and low-quality cells.
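To make the weak-supervision idea concrete, here is a minimal pure-Python sketch under stated assumptions. It is not dropkick's code. The threshold on total counts is fixed by hand rather than found automatically, and the model is fitted with plain gradient descent and a small L2 penalty, which (unlike the sparse model described above) does not drive coefficients to zero. All names (weak_labels, fit_logistic, score) and the toy counts are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weak_labels(counts, threshold):
    """Heuristic labels: 1 if a barcode's total counts reach the threshold, else 0."""
    return [1 if sum(row) >= threshold else 0 for row in counts]


def fit_logistic(X, y, lr=0.1, n_iter=2000, l2=0.01):
    """Fit logistic regression by batch gradient descent with an L2 penalty on the weights."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
    return w, b


def score(X, w, b):
    """Probability of the positive ("real cell") class for every barcode."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


# Toy barcode-by-gene counts: three high-count barcodes, three low-count barcodes.
counts = [
    [40, 25, 2],
    [35, 30, 1],
    [50, 20, 3],
    [1, 0, 4],
    [0, 1, 5],
    [2, 0, 3],
]

labels = weak_labels(counts, threshold=20)              # hand-picked threshold (assumption)
X = [[math.log1p(c) for c in row] for row in counts]    # log-transform the gene counts
w, b = fit_logistic(X, labels)
scores = score(X, w, b)

for s, lab in zip(scores, labels):
    print(f"heuristic label={lab}  model score={s:.3f}")
```

The point of the two-stage design is that the heuristic labels only need to be roughly right: the gene-based model is then free to assign intermediate scores to barcodes near the threshold, which is where borderline cells would be recovered or rejected.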

dropkick can be run as a command line tool or interactively with the scanpy single-cell analysis suite.
Usage from command line:
dropkick run path/to/counts.h5ad
Installation
Installation via pip or from source requires a Fortran compiler (macOS users can get one with brew install gcc). The only other prerequisite is the numpy package, which must be present in the Python environment before installing dropkick (pip install numpy).
Install from PyPI:
pip install dropkick
Or compile from source:
git clone https://github.com/KenLauLab/dropkick.git
cd dropkick
python setup.py install
Documentation
Full documentation is available at KenLauLab.github.io/dropkick.