Cell Ranger DNA 1.0, printed on 11/12/2024
The cellranger-dna cnv pipeline outputs a single position-sorted and indexed BAM file. This file is primarily provided for use with a BAM visualization tool such as the Integrative Genomics Viewer (IGV).
File | Records | Reference | Description |
---|---|---|---|
possorted_bam.bam | Reads | User-specified reference | Barcode-corrected reads aligned to the user-specified reference, sorted by reference position. |
The following assumes basic familiarity with the BAM format. More details on the SAM/BAM standard are available online.
Chromium cellular and molecular barcode information for each read is stored as TAG fields:
Tag | Type | Description |
---|---|---|
CB | Z | Chromium cellular barcode sequence that is error-corrected and confirmed against a list of known-good barcode sequences. |
CR | Z | Chromium cellular barcode sequence as reported by the sequencer. |
CY | Z | Chromium cellular barcode read quality. Phred scores as reported by sequencer. |
BC | Z | Sample index read. |
QT | Z | Sample index read quality. Phred scores as reported by sequencer. |
GP | Z | Genome position. Note: this is an auxiliary tag used for the purpose of duplicate marking and is not intended for downstream use. We intend to deprecate this tag in subsequent versions. |
MP | Z | Genome position of mate-pair. Note: this is an auxiliary tag used for the purpose of duplicate marking and is not intended for downstream use. We intend to deprecate this tag in subsequent versions. |
DC | Z | Number of inferred PCR duplicates for this non-duplicate read. Note: this is an auxiliary tag used for the purpose of duplicate marking and is not intended for downstream use. We intend to deprecate this tag in subsequent versions. |
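As a rough illustration of how these tags can be read, the sketch below parses the optional fields of a single text-format SAM record, such as one line printed by `samtools view possorted_bam.bam`. It relies only on the SAM convention that optional fields follow the 11 mandatory columns and are written as `TAG:TYPE:VALUE`. The function name `parse_sam_tags` and the example record are hypothetical and were made up for this illustration; they are not part of Cell Ranger DNA.

```python
def parse_sam_tags(sam_line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM record as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns are mandatory SAM fields; optional tags follow them.
    for field in fields[11:]:
        # Split at most twice so that values containing ':' stay intact.
        tag, value_type, value = field.split(":", 2)
        if value_type == "i":
            value = int(value)
        elif value_type == "f":
            value = float(value)
        tags[tag] = value
    return tags


# Hypothetical record, invented for illustration only (not real pipeline output).
example_record = "\t".join([
    "read1", "99", "chr1", "10000", "60", "10M", "=", "10100", "110",
    "ACGTACGTAC", "FFFFFFFFFF",
    "CB:Z:AGAATGGTCTGCAT-1",
    "CR:Z:AGAATGGTCTGCAT",
    "CY:Z:FFFFFFFFFFFFFF",
])

tags = parse_sam_tags(example_record)
print(tags["CB"])  # corrected cell barcode, including the GEM group suffix
print(tags["CR"])  # barcode sequence as reported by the sequencer
```

A read whose barcode could not be confirmed may lack some of these tags, so looking them up with `tags.get("CB")` rather than `tags["CB"]` is the safer choice in a real script.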
The cell barcode CB tag includes a suffix "-1" that identifies the GEMs from a single channel, which we call a GEM group. For example:
AGAATGGTCTGCAT-1
Cell Ranger DNA currently supports only libraries generated from a single GEM run, so the suffix is always -1. It can either be left in place and treated as part of a unique barcode identifier, or explicitly parsed out to leave only the barcode sequence itself.
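Parsing the suffix out could look like the minimal sketch below. The helper name `split_cell_barcode` is hypothetical, and the code assumes the GEM group is the integer after the last hyphen in the CB value.

```python
def split_cell_barcode(cb):
    """Split a CB value into (barcode_sequence, gem_group).

    Returns (cb, None) when no numeric suffix is present.
    """
    sequence, sep, suffix = cb.rpartition("-")
    if not sep or not suffix.isdigit():
        return cb, None
    return sequence, int(suffix)


sequence, gem_group = split_cell_barcode("AGAATGGTCTGCAT-1")
print(sequence)   # AGAATGGTCTGCAT
print(gem_group)  # 1
```

Keeping the GEM group as a separate value, instead of discarding it, means the same code keeps working if the suffix ever takes a value other than 1.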