Cell Ranger 3.1, printed on 12/17/2024
Human samples and mouse strains of types C57BL/6 and BALB/c have been tested. We provide primers and reference sequences that are known to work with these samples. If you use a different mouse strain, then it is possible that the primers and/or reference sequences will be insufficient. If you use another species, then you need to create your own primers and reference sequence. Note that defects in either could result in loss of clonotypes.
In general, spiking cell lines into your sample as a control may have unintended consequences. It may introduce high levels of background (particularly if the cell line cells are large or leaky), and it may confound the cell calling algorithm.
See the primary recommendation for V(D)J sequencing. This recommendation is repeated here. The following sections discuss the issues and choices available.
Read configuration | Recommended read pairs per cell |
---|---|
26 x 91 | 5000 |
Cell refers to a recovered, targeted cell. A recovered cell is a cell captured in a GEM. A targeted cell may be a T or B cell, depending on which enrichment primers are used.
When you run Cell Ranger, the number of recovered targeted cells is estimated from the sequence data. Before sequencing is carried out, your estimate may instead be based on assays with higher variability, on guesswork, or both.
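As a rough planning aid, the recommendation can be turned into a read budget for a library. The sketch below is illustrative only: the function names and the cell counts are assumptions, and the only figure taken from this page is the 5000 read pairs per cell for the 26 x 91 configuration.

```python
# Recommended depth for the 26 x 91 configuration (from the table above).
RECOMMENDED_PAIRS_PER_CELL = 5000


def total_read_pairs(expected_cells: int,
                     pairs_per_cell: int = RECOMMENDED_PAIRS_PER_CELL) -> int:
    """Read pairs to request for one V(D)J library (hypothetical helper)."""
    if expected_cells < 0:
        raise ValueError("expected_cells must be non-negative")
    return expected_cells * pairs_per_cell


def realized_pairs_per_cell(sequenced_pairs: int, recovered_cells: int) -> float:
    """Depth actually achieved once the recovered cell count is known."""
    if recovered_cells <= 0:
        raise ValueError("recovered_cells must be positive")
    return sequenced_pairs / recovered_cells


# Illustrative numbers only: plan for 3000 cells, then recover 2000.
planned = total_read_pairs(3000)
achieved = realized_pairs_per_cell(planned, 2000)
print(planned, achieved)
```

Because the pre-sequencing cell estimate can be off, the depth achieved per recovered cell may differ noticeably from the depth that was planned.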
One reason we recommend 26 x 91 read pairs is that it facilitates efficient sequencing of multiple library types including V(D)J and GEX in a single sequencing run.
For most samples, the recommended sequencing depth will approach the limit of what you would obtain at high depth, even using longer (150 x 150) reads. However, there are exceptions. Sequencing saturation in the following plots is computed as:
$$
\text{sequencing saturation} = \frac{\text{number of productive pairs obtained using } 26 \times 91 \text{ read pairs at given depth}}{\text{number of productive pairs obtained using } 150 \times 150 \text{ read pairs at high depth}}
$$
Here high depth is ten-fold higher than the recommendation.
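The ratio can be computed directly from two productive-pair counts. This is a minimal sketch, not pipeline code: the function name is hypothetical and the counts in the example are made up for illustration.

```python
def sequencing_saturation(pairs_26x91_at_depth: int,
                          pairs_150x150_at_high_depth: int) -> float:
    """Ratio of productive pairs at the given depth (26 x 91 reads) to
    productive pairs at high depth (150 x 150 reads)."""
    if pairs_150x150_at_high_depth <= 0:
        raise ValueError("the high-depth reference count must be positive")
    return pairs_26x91_at_depth / pairs_150x150_at_high_depth


# Made-up counts: 950 productive pairs at the given depth versus
# 1000 in the high-depth 150 x 150 reference.
saturation = sequencing_saturation(950, 1000)
print(f"{saturation:.0%}")
```

A value close to 1 means that deeper sequencing would add few productive pairs for that sample.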
The plasma cell enriched sample is from a tumor, and in addition to having many plasma cells, may also have a significant fraction of dying cells.
Some samples do not saturate at the recommended sequencing depth. The most common cause is extreme variation in expression levels between cells in a sample: the high-expression cells soak up most of the sequencing, leaving little for the low-expression cells. For example, this could happen if:
Samples with very low overall expression levels may also require greater sequencing depth to approach saturation.
Note that saturation may not be needed to achieve experimental goals. For example, obtaining data from many dying cells may or may not be useful.
Usually not. There are some exceptions, which depend on your sample type and the needs of your experiment:
If your sample is precious and it is more important to get data from all cells than to save on sequencing cost, use higher depth.
If recovery of low-expression cells is critical to your experimental design, and your sample type is prone to high variation, use higher depth.
If large savings on sequencing cost are more important than modest increases to yield, use lower depth.
If you have a library that was sequenced across multiple flowcells, you can pool the reads from all of the sequencing runs. Follow the steps in Specifying Input FASTQs to combine them in a single cellranger vdj run.
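One way to assemble such a run is sketched below. It assumes that `--fastqs` accepts a comma-separated list of FASTQ directories; treat Specifying Input FASTQs as the authoritative description of the syntax. The helper name, run ID, paths, and sample name are all hypothetical placeholders.

```python
import shlex


def vdj_command(run_id: str, fastq_dirs: list[str],
                reference: str, sample: str) -> list[str]:
    """Build a cellranger vdj command line that pools several FASTQ folders."""
    if not fastq_dirs:
        raise ValueError("at least one FASTQ directory is required")
    # Assumption: one comma-separated --fastqs value lists every flowcell's folder.
    fastqs = ",".join(fastq_dirs)
    return [
        "cellranger", "vdj",
        f"--id={run_id}",
        f"--fastqs={fastqs}",
        f"--reference={reference}",
        f"--sample={sample}",
    ]


# Placeholder paths for two flowcells of the same library.
cmd = vdj_command(
    "pooled_vdj",
    ["/data/flowcell1/fastqs", "/data/flowcell2/fastqs"],
    "/refs/vdj_reference",
    "mysample",
)
print(shlex.join(cmd))  # print the command for review before running it
```

Printing the command rather than executing it keeps the sketch free of side effects; the list could be passed to `subprocess.run` once the arguments have been checked.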
We also support a 150 x 150 configuration:
Alternate read configuration | Recommended read pairs per cell |
---|---|
150 x 150 | 2000 |
This produces about the same number of read bases as the primary recommendation (26 x 91, 5000 read pairs per cell). The coverage response curves for these two configurations match closely.
Other read lengths may work but have not been tested. Coverage would need to be adjusted in the same fashion so that the number of read bases is about the same. Use of a second read that is significantly shorter than 91 bases may not work well.
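The "about the same number of read bases" comparison can be checked with simple arithmetic. The sketch below assumes that read bases per cell means (read 1 length + read 2 length) x read pairs per cell; the helper names are hypothetical, and the 26 x 150 configuration is an untested example used only to show the adjustment.

```python
def bases_per_cell(read1_len: int, read2_len: int, pairs_per_cell: int) -> int:
    """Total sequenced bases per cell for a paired-end configuration."""
    return (read1_len + read2_len) * pairs_per_cell


primary = bases_per_cell(26, 91, 5000)      # 117 bases per pair
alternate = bases_per_cell(150, 150, 2000)  # 300 bases per pair


def matched_pairs_per_cell(read1_len: int, read2_len: int,
                           target_bases: int = primary) -> int:
    """Read pairs per cell that give at least target_bases bases per cell."""
    pair_len = read1_len + read2_len
    return -(-target_bases // pair_len)  # integer ceiling division


print(primary, alternate)               # 585000 600000
print(matched_pairs_per_cell(26, 150))  # untested configuration, illustration only
```

The two supported configurations differ by under 3% in bases per cell, which is consistent with their closely matching coverage response curves.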
Cell Ranger 3.1 increases yield across a wide range of datasets, and the performance gain increases as coverage decreases. Here we show one example, noting that gains vary considerably from sample to sample:
Although 26 x 91 data was not supported by Cell Ranger 3.0, it is shown here for clarity of comparison. The same plot using 150 x 150 data for both code versions is highly similar.
There is no aggr function for V(D)J libraries at this time. However, if you have a single cell suspension which was run on multiple chip channels (GEM wells), producing multiple V(D)J libraries, you can analyze all of the libraries at once. The workflow requires writing a custom configuration file for the pipeline, as specified in Multi-Library Samples.