Cell Ranger 2.2, printed on 11/14/2024
Cell Ranger for V(D)J is a set of analysis pipelines that process Chromium single cell 5′ RNA-seq output to assemble, quantify, and annotate paired V(D)J transcript sequences.
Cell Ranger includes two pipelines specific to V(D)J analysis, though integrated experiments may also make use of pipelines for gene expression, especially cellranger count.
cellranger mkfastq demultiplexes raw base call (BCL) files generated by Illumina sequencers into FASTQ files. It is a wrapper around Illumina's bcl2fastq, with additional useful features that are specific to 10x libraries and a simplified sample sheet format.
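As a rough illustration of the mkfastq step, the sketch below writes a three-column simple sample sheet (Lane, Sample, Index) and assembles a cellranger mkfastq command line for one flowcell directory. The helper names (simple_samplesheet, mkfastq_command), the paths, the sample name, and the index value are placeholders chosen for this example, not values from this documentation. The command is only built and printed, never executed.

```python
import csv
import io
import shlex


def simple_samplesheet(rows):
    """Render (lane, sample, index) tuples in the three-column simple CSV layout."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Lane", "Sample", "Index"])
    writer.writerows(rows)
    return buf.getvalue()


def mkfastq_command(run_dir, samplesheet_path, run_id):
    """Build the argument list for one flowcell directory; nothing is run here."""
    return [
        "cellranger", "mkfastq",
        f"--id={run_id}",
        f"--run={run_dir}",
        f"--csv={samplesheet_path}",
    ]


# Placeholder values for illustration only.
sheet = simple_samplesheet([(1, "vdj_sample", "SI-GA-A1")])
cmd = mkfastq_command("/path/to/flowcell_dir", "samplesheet.csv", "flowcell1")

print(sheet, end="")
print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the command as a list of arguments makes it straightforward to pass to subprocess.run later without shell quoting problems.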
cellranger vdj takes FASTQ files from cellranger mkfastq and performs V(D)J sequence assembly and paired clonotype calling. It uses the Chromium cellular barcodes and UMIs to assemble V(D)J transcripts cell-by-cell. cellranger vdj can take input from multiple sequencing runs on the same library.
As of version 2.1, cellranger count can also perform gene expression analysis on 5′ sequencing data. See Single Cell V(D)J + 5′ Gene Expression for more details.
Output is delivered in standard BAM, CSV, FASTA, FASTQ, JSON, and HTML formats, augmented with cell- and clonotype-specific information.
Throughout the documentation, you will see references to samples, libraries, and sequencing runs. We define these as follows:
The relationship between these terms can be complex:
The Cell Ranger workflow always starts with running cellranger mkfastq on each flowcell directory, as described in Generating FASTQs. The subsequent steps vary depending on how many samples, libraries and flowcells you have. We will describe them in order of increasing complexity:
This is the most basic case. You have a single biological sample, which was prepared into a single library, and then sequenced on a single flowcell. Assuming the FASTQs have been generated with cellranger mkfastq, you just need to run cellranger vdj as described in V(D)J T Cell and B Cell Analysis.
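For the basic case just described, a minimal sketch of the single cellranger vdj invocation might look like the following. The helper name vdj_command and all paths, IDs, and sample names are placeholders invented for this example. The command is assembled and printed rather than run.

```python
import shlex


def vdj_command(run_id, fastq_dir, reference_dir, sample):
    """Build the argument list for one library sequenced on one flowcell."""
    return [
        "cellranger", "vdj",
        f"--id={run_id}",
        f"--reference={reference_dir}",
        f"--fastqs={fastq_dir}",
        f"--sample={sample}",
    ]


# Placeholder values for illustration only.
cmd = vdj_command(
    run_id="sample1_vdj",
    fastq_dir="/path/to/flowcell1/outs/fastq_path",
    reference_dir="/path/to/vdj_reference",
    sample="vdj_sample",
)

print(" ".join(shlex.quote(part) for part in cmd))
```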
If you have a library that was sequenced across multiple flowcells, you can pool the reads from all of the sequencing runs. Follow the steps in Specifying Input FASTQs to combine them in a single cellranger vdj run.
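One way to pool reads from several flowcells is to pass every FASTQ directory to a single cellranger vdj run as a comma-separated --fastqs value. The sketch below assumes that form and builds such a command. The helper names (pooled_fastqs_arg, pooled_vdj_command) and all paths are placeholders for this example, and Specifying Input FASTQs remains the authoritative description of the accepted input layouts.

```python
def pooled_fastqs_arg(fastq_dirs):
    """Join per-flowcell FASTQ directories into one comma-separated --fastqs value."""
    if not fastq_dirs:
        raise ValueError("at least one FASTQ directory is required")
    seen = []
    for path in fastq_dirs:
        if "," in path:
            raise ValueError(f"path contains a comma and cannot be listed: {path}")
        if path not in seen:  # skip duplicates while keeping the original order
            seen.append(path)
    return "--fastqs=" + ",".join(seen)


def pooled_vdj_command(run_id, fastq_dirs, reference_dir, sample):
    """Build one cellranger vdj argument list covering all flowcells of a library."""
    return [
        "cellranger", "vdj",
        f"--id={run_id}",
        f"--reference={reference_dir}",
        pooled_fastqs_arg(fastq_dirs),
        f"--sample={sample}",
    ]


# Placeholder values for illustration only.
cmd = pooled_vdj_command(
    run_id="sample1_vdj",
    fastq_dirs=[
        "/path/to/flowcell1/outs/fastq_path",
        "/path/to/flowcell2/outs/fastq_path",
    ],
    reference_dir="/path/to/vdj_reference",
    sample="vdj_sample",
)

print(" ".join(cmd))
```

Rejecting paths that contain commas up front avoids silently splitting one directory into two when the list is parsed.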
If you have a single sample which was run on multiple chip channels, producing multiple libraries, you can analyze all of the libraries at once. The workflow requires writing a custom configuration file for the pipeline, as specified in Multi-Library Samples.
5′ gene expression libraries and V(D)J enriched libraries generated from the same cDNA product must be processed by cellranger count and cellranger vdj, respectively. Refer to Single Cell V(D)J + 5′ Gene Expression for more information.
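To make the split between the two pipelines concrete, the sketch below builds one cellranger count command for the 5′ gene expression library and one cellranger vdj command for the V(D)J enriched library from the same sample. The helper name paired_commands, the "_gex" and "_vdj" ID suffixes, and all paths and sample names are assumptions made for this example. Any further options that 5′ data may need are covered in Single Cell V(D)J + 5′ Gene Expression and are not shown here.

```python
def paired_commands(prefix, gex_fastqs, vdj_fastqs,
                    transcriptome, vdj_reference,
                    gex_sample, vdj_sample):
    """Build separate argument lists for the two libraries of one sample.

    Distinct --id values keep the two pipeline output directories apart.
    """
    count_cmd = [
        "cellranger", "count",
        f"--id={prefix}_gex",
        f"--transcriptome={transcriptome}",
        f"--fastqs={gex_fastqs}",
        f"--sample={gex_sample}",
    ]
    vdj_cmd = [
        "cellranger", "vdj",
        f"--id={prefix}_vdj",
        f"--reference={vdj_reference}",
        f"--fastqs={vdj_fastqs}",
        f"--sample={vdj_sample}",
    ]
    return {"count": count_cmd, "vdj": vdj_cmd}


# Placeholder values for illustration only.
commands = paired_commands(
    prefix="sample1",
    gex_fastqs="/path/to/gex/outs/fastq_path",
    vdj_fastqs="/path/to/vdj/outs/fastq_path",
    transcriptome="/path/to/gex_reference",
    vdj_reference="/path/to/vdj_reference",
    gex_sample="sample1_5gex",
    vdj_sample="sample1_vdj_lib",
)

for name, argv in commands.items():
    print(name + ":", " ".join(argv))
```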