10x Genomics
Chromium Single Cell ATAC
Cell Ranger ATAC 5.0, printed on 11/14/2024
Differences from Gene Expression
If you've used Loupe Browser before to analyze gene expression, you will find exploring ATAC
data familiar in some ways, and different in others. The Cell Ranger ATAC algorithm documentation
covers algorithms and analysis in more detail, but in short, here are some key things to keep in mind when looking at ATAC data:
- UMI count per cell is the unit of gene expression. Cut sites per cell is the unit of accessibility.
- Genes are the rows of a gene expression matrix. Peaks are the rows of a chromatin accessibility matrix.
- Peaks are genomic regions with a significant enrichment of fragment cut sites, which indicates open chromatin. They are named by their location (e.g., "chr1:10244-10510").
- Unlike genes, peaks are likely to differ between datasets.
- There are typically more distinct peaks in an ATAC dataset than there are genes in a reference.
- The dynamic range of gene expression per cell is typically much wider than the dynamic range of cut sites per peak per cell. This means that you will often use aggregate features (see below) to separate data.
- In addition to peaks, there are several aggregate feature types that can also be used to differentiate cells:
  - Promoter sums, which are the sums of cut sites per cell within peaks that are close to one of the transcription start sites of a given gene. These features are named "(Gene) Sum". Not all peaks are associated with a gene.
  - Transcription factor motifs, which are the sums of cut sites per cell that fall within peaks associated with a motif by the Cell Ranger ATAC pipeline. Motif features are named after the motifs themselves (e.g., "SPI1"). A peak is usually associated with multiple motifs.
- An ATAC dataset takes up several times as much disk space (per cell) as a gene expression dataset.
- To see fragment locations per cluster in high resolution, you need access to the fragments.tsv.gz file for that run, generated by the Cell Ranger ATAC pipeline. These files are typically several times larger than the .cloupe file, which is why they are not bundled. You can specify the location of this file either on a locally mounted file system or on the web via a URL.
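
To make the role of the fragments file more concrete, here is a minimal Python sketch of how cut sites in a genomic window could be tallied per cluster from a local fragments.tsv.gz. It is not how Loupe Browser itself reads the file. It assumes tab-separated rows of chromosome, start, end, and cell barcode (with any further columns ignored) and that lines beginning with "#" are header comments. The function name `cut_sites_per_cluster` and the `barcode_to_cluster` mapping are hypothetical, and treating both fragment ends as cut sites is a simplification.

```python
import gzip
from collections import Counter


def cut_sites_per_cluster(fragments_path, barcode_to_cluster, chrom, start, end):
    """Count fragment ends falling in [start, end) on `chrom`, grouped by cluster.

    Assumed row layout (tab-separated): chrom, start, end, barcode, ...
    `barcode_to_cluster` is a hypothetical dict mapping barcode -> cluster name.
    """
    counts = Counter()
    with gzip.open(fragments_path, "rt") as handle:
        for line in handle:
            # Skip header comments and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            f_chrom, f_start, f_end, barcode = line.rstrip("\n").split("\t")[:4]
            cluster = barcode_to_cluster.get(barcode)
            # Ignore barcodes without a cluster and fragments on other chromosomes.
            if cluster is None or f_chrom != chrom:
                continue
            # Simplification: treat both ends of a fragment as cut sites.
            for site in (int(f_start), int(f_end)):
                if start <= site < end:
                    counts[cluster] += 1
    return counts
```

Because the file is streamed line by line, the sketch never holds the whole fragments file in memory, which matters given how large these files are compared with the .cloupe file.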
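
The peak naming scheme and the promoter sum features described above can also be illustrated with a short Python sketch. It parses peak names of the form "chr1:10244-10510" and sums per-cell cut sites over peaks that lie near any transcription start site of a gene, naming the result "(Gene) Sum". The input dictionaries, the function names, and the `max_distance` cutoff are all assumptions made for illustration. The actual rule the Cell Ranger ATAC pipeline uses to decide that a peak is "close to" a transcription start site is not stated here.

```python
from collections import defaultdict


def parse_peak(name):
    """Split a peak name like 'chr1:10244-10510' into (chrom, start, end)."""
    chrom, _, span = name.rpartition(":")
    start, _, end = span.partition("-")
    return chrom, int(start), int(end)


def promoter_sums(peak_counts, tss_by_gene, max_distance=1000):
    """Sum per-cell cut sites over peaks near any TSS of each gene.

    peak_counts: {peak_name: {barcode: cut_site_count}}   (hypothetical layout)
    tss_by_gene: {gene: [(chrom, position), ...]}         (hypothetical layout)
    max_distance: hypothetical cutoff in base pairs, not the pipeline's rule.
    """
    sums = defaultdict(lambda: defaultdict(int))
    for peak_name, per_cell in peak_counts.items():
        chrom, start, end = parse_peak(peak_name)
        for gene, tss_list in tss_by_gene.items():
            # A peak counts toward a gene if any of the gene's TSSs is nearby.
            near = any(
                tss_chrom == chrom
                and start - max_distance <= pos <= end + max_distance
                for tss_chrom, pos in tss_list
            )
            if near:
                for barcode, count in per_cell.items():
                    sums[f"{gene} Sum"][barcode] += count
    # Peaks that are near no TSS contribute to no promoter sum.
    return {feature: dict(cells) for feature, cells in sums.items()}
```

Aggregating several sparse peaks into one feature per gene gives each cell a larger count to work with, which is why such features help when the per-peak dynamic range is narrow.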