Cell Ranger 2.2, printed on 12/17/2024
Cell Ranger's pipelines analyze sequencing data produced from Chromium single cell 5′ RNA-seq libraries. This involves the following steps:
1. Run cellranger mkfastq on the Illumina BCL output folder to generate FASTQ files.
2. Run cellranger vdj on the FASTQ files produced by cellranger mkfastq.
For the following example, assume that the Illumina BCL output is in a folder named /sequencing/140101_D00123_0111_AHAWT7ADXX.
First, follow the instructions on running cellranger mkfastq to generate FASTQ files. For example, if the flowcell serial number was HAWT7ADXX, then cellranger mkfastq will output FASTQ files in HAWT7ADXX/outs/fastq_path.
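The relationship between the run folder name and the FASTQ location in this example can be sketched in shell. This is only an illustration. It assumes the run folder name ends in a one-letter flowcell side followed by the flowcell serial number, as in the example path above, and that cellranger mkfastq named its output folder after the flowcell serial number. The variable names are hypothetical.

```shell
# Assumption: the run folder name ends in <side letter><flowcell serial>,
# as in the example path used on this page.
bcl_dir=/sequencing/140101_D00123_0111_AHAWT7ADXX

run_name=${bcl_dir##*/}        # strip the parent directories
last_field=${run_name##*_}     # keep the text after the last underscore
flowcell=${last_field#?}       # drop the leading side letter

# Relative path, under the directory where mkfastq was run.
fastq_path="$flowcell/outs/fastq_path"
echo "$fastq_path"
```

If your run folders follow a different naming scheme, set the flowcell serial number by hand instead of deriving it.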
To generate single-cell V(D)J sequences and annotations for a single library, run cellranger vdj with the following arguments. For a complete list of command-line arguments, run cellranger vdj --help.
For help on which arguments to use to target a particular set of FASTQs, consult Running 10x Pipelines on FASTQ Files.
Argument | Description |
---|---|
--id | A unique run ID string: e.g. sample345 |
--fastqs | Path of the FASTQ folder generated by cellranger mkfastq, e.g. /home/jdoe/runs/HAWT7ADXX/outs/fastq_path. Can take multiple comma-separated paths, which is helpful if the same library was sequenced on multiple flowcells. Doing this will treat all reads from the library, across flowcells, as one sample. If you have multiple libraries for the sample, you will need to run cellranger vdj using a custom MRO file as detailed on the Multi-Flowcell Samples page. |
--reference | Path to the Cell Ranger V(D)J compatible reference e.g. /opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-2.0.0 . If --denovo is specified, this argument is optional. |
--sample | Sample name as specified in the sample sheet supplied to mkfastq. Can take multiple comma-separated values, which is helpful if the sample was sequenced on multiple flowcells and the sample name used (and therefore the FASTQ file prefix) is not identical between them. Doing this will treat all reads from the library, across flowcells, as one sample. If you have multiple libraries for the sample, you will need to run cellranger vdj using a custom MRO file as detailed on the Multi-Flowcell Samples page. |
--force-cells | (optional) Force pipeline to use this number of cells, bypassing the cell detection algorithm. Use this if the number of cells estimated by Cell Ranger is not consistent with the barcode rank plot. |
--denovo | (optional) Do not require that reads approximately align to the V(D)J reference before assembly. If specified, --reference becomes optional. This is useful for full de novo assembly without a V(D)J reference. NOTE: A larger set of contigs may be produced compared to the non-denovo mode. These may include V(D)J sequences with no homology to the reference. Fragments of highly abundant non-V(D)J transcripts may also be assembled. Careful downstream filtering and analysis of assembled sequences is recommended when using denovo mode. |
--chain | (optional) Force the web summary HTML and metrics summary CSV to only report on a particular chain type. The accepted values are:
|
--lanes | (optional) Lanes associated with this sample |
--localcores | (optional) Restricts cellranger to use specified number of cores to execute pipeline stages. By default, cellranger will use all of the cores available on your system. |
--localmem | (optional) Restricts cellranger to use specified amount of memory (in GB) to execute pipeline stages. By default, cellranger will use 90% of the memory available on your system. Please note that cellranger requires at least 16 GB of memory to run all pipeline stages. |
--indices | (Deprecated. Optional. Only used for output from cellranger demux) Sample indices associated with this sample. Comma-separated list of:
|
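As a rough sketch of the comma-separated form accepted by --fastqs, the snippet below joins several FASTQ folders into one argument value. The helper name join_commas and the second path (FLOWCELL2) are hypothetical and stand in for a second flowcell on which the same library was sequenced.

```shell
# Join all arguments with commas. The function body runs in a subshell,
# so changing IFS here does not affect the caller.
join_commas() (
    IFS=,
    printf '%s\n' "$*"
)

# The second path is hypothetical: a second flowcell for the same library.
fastqs=$(join_commas \
    /home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
    /home/jdoe/runs/FLOWCELL2/outs/fastq_path)

echo "--fastqs=$fastqs"
```

The same helper could be used to build a comma-separated --sample value when the sample name differs between flowcells.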
After determining these input arguments, run cellranger:
    $ cd /home/jdoe/runs
    $ cellranger vdj --id=sample345 \
                     --reference=/opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-2.0.0 \
                     --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
                     --sample=mysample
Following a set of preflight checks to validate input arguments, cellranger vdj pipeline stages will begin to run:
    Martian Runtime - v2.3.3

    Running preflight checks (please wait)...
    2017-04-15 14:23:52 [runtime] (ready) ID.sample345.SC_VDJ_ASSEMBLER_CS.SC_VDJ_ASSEMBLER.SETUP_CHUNKS
    2017-04-15 14:23:55 [runtime] (split_complete) ID.sample345.SC_VDJ_ASSEMBLER_CS.SC_VDJ_ASSEMBLER.SETUP_CHUNKS
    2017-04-15 14:23:55 [runtime] (run:local) ID.sample345.SC_VDJ_ASSEMBLER_CS.SC_VDJ_ASSEMBLER.SETUP_CHUNKS.fork0.chnk0.main
    ...
By default, cellranger will use all of the cores available on your system to execute pipeline stages. You can specify a different number of cores to use with the --localcores option; for example, --localcores=16 will limit cellranger to using up to sixteen cores at once. Similarly, --localmem will restrict the amount of memory (in GB) used by cellranger.
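A minimal sketch of adding both limits to the example run is shown below. The values 16 and 64 are hypothetical choices for a shared machine, and the snippet only prints the assembled command (a dry run) rather than launching the pipeline.

```shell
# Hypothetical resource limits for a shared workstation.
cores=16
mem_gb=64   # --localmem is given in GB

# Assemble the command as positional parameters so each argument stays intact.
set -- cellranger vdj \
    --id=sample345 \
    --reference=/opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-2.0.0 \
    --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
    --sample=mysample \
    --localcores="$cores" \
    --localmem="$mem_gb"

# Dry run: print the command instead of executing it.
# Replace the echo line with "$@" to actually launch the pipeline.
cmd="$*"
echo "$cmd"
```

Printing the command first makes it easy to confirm the limits before committing to a long run.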
The pipeline will create a new folder named with the sample ID you specified (e.g. /home/jdoe/runs/sample345) for its output. If this folder already exists, cellranger will assume it is an existing pipestance and attempt to resume running it.
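Because an existing folder changes what cellranger does, it can help to check for it before launching. The following is a small sketch of such a check; run_state is a hypothetical helper, not part of Cell Ranger.

```shell
# Report whether launching with a given run folder would start a new run
# or resume an existing pipestance. "run_state" is a hypothetical helper.
run_state() {
    if [ -d "$1" ]; then
        echo "resume"
    else
        echo "new"
    fi
}

run_dir=/home/jdoe/runs/sample345
case "$(run_state "$run_dir")" in
    resume) echo "$run_dir exists: cellranger will try to resume it" ;;
    new)    echo "$run_dir does not exist: a new run will be created" ;;
esac
```

If you intend to start from scratch, choose a different --id or move the old folder aside first.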
A successful cellranger vdj run should conclude with a message similar to this:
    2017-04-15 14:32:18 [runtime] (join_complete) ID.sample345.SC_VDJ_ASSEMBLER_CS.VLOUPE_PREPROCESS

    Outputs:
    - Run summary HTML: /home/jdoe/runs/sample345/outs/web_summary.html
    - Run summary CSV: /home/jdoe/runs/sample345/outs/metrics_summary.csv
    - All-contig FASTA: /home/jdoe/runs/sample345/outs/all_contig.fasta
    - All-contig FASTA index: /home/jdoe/runs/sample345/outs/all_contig.fasta.fai
    - All-contig FASTQ: /home/jdoe/runs/sample345/outs/all_contig.fastq
    - Read-contig alignments: /home/jdoe/runs/sample345/outs/all_contig.bam
    - Read-contig alignment index: /home/jdoe/runs/sample345/outs/all_contig.bam.bai
    - All contig annotations (JSON): /home/jdoe/runs/sample345/outs/all_contig_annotations.json
    - All contig annotations (BED): /home/jdoe/runs/sample345/outs/all_contig_annotations.bed
    - All contig annotations (CSV): /home/jdoe/runs/sample345/outs/all_contig_annotations.csv
    - Filtered contig sequences FASTA: /home/jdoe/runs/sample345/outs/filtered_contig.fasta
    - Filtered contig sequences FASTQ: /home/jdoe/runs/sample345/outs/filtered_contig.fastq
    - Filtered contigs (CSV): /home/jdoe/runs/sample345/outs/filtered_contig_annotations.csv
    - Clonotype consensus FASTA: /home/jdoe/runs/sample345/outs/consensus.fasta
    - Clonotype consensus FASTA index: /home/jdoe/runs/sample345/outs/consensus.fasta.fai
    - Clonotype consensus FASTQ: /home/jdoe/runs/sample345/outs/consensus.fastq
    - Concatenated reference sequences: /home/jdoe/runs/sample345/outs/concat_ref.fasta
    - Concatenated reference index: /home/jdoe/runs/sample345/outs/concat_ref.fasta.fai
    - Contig-consensus alignments: /home/jdoe/runs/sample345/outs/consensus.bam
    - Contig-consensus alignment index: /home/jdoe/runs/sample345/outs/consensus.bam.bai
    - Contig-reference alignments: /home/jdoe/runs/sample345/outs/concat_ref.bam
    - Contig-reference alignment index: /home/jdoe/runs/sample345/outs/concat_ref.bam.bai
    - Clonotype consensus annotations (JSON): /home/jdoe/runs/sample345/outs/consensus_annotations.json
    - Clonotype consensus annotations (CSV): /home/jdoe/runs/sample345/outs/consensus_annotations.csv
    - Clonotype info: /home/jdoe/runs/sample345/outs/clonotypes.csv
    - Loupe V(D)J Browser file: /home/jdoe/runs/sample345/outs/vloupe.vloupe

    Pipestance completed successfully!
The output of the pipeline will be contained in a folder named with the sample ID you specified (e.g. sample345). The subfolder named outs will contain the main pipeline output files.
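One way to confirm that a run produced its main files is to check the outs subfolder for a few of them. The sketch below assumes the file names shown in the example output listing for this version; missing_outputs is a hypothetical helper, and the list of files it checks is a small subset chosen for illustration.

```shell
# Print each expected file that is missing from an outs folder.
# "missing_outputs" is a hypothetical helper; the file names are taken
# from the example output listing on this page.
missing_outputs() {
    for f in web_summary.html metrics_summary.csv clonotypes.csv \
             filtered_contig_annotations.csv vloupe.vloupe; do
        [ -e "$1/$f" ] || echo "$f"
    done
}

outs=/home/jdoe/runs/sample345/outs
missing=$(missing_outputs "$outs")
if [ -z "$missing" ]; then
    echo "all expected files present in $outs"
else
    echo "missing from $outs:"
    echo "$missing"
fi
```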
Once cellranger vdj has successfully completed, you can browse the resulting summary HTML file in any supported web browser, open the .vloupe file in Loupe V(D)J Browser, or refer to the Understanding Output section to explore the data by hand.