Cell Ranger 7.1 (latest), printed on 11/23/2024
The per_sample_outs/ directory is produced after a successful execution of the multi pipeline and contains filtered data, i.e., data from cell-associated barcodes in this sample. These are the main outputs of interest.
The contents of the folders located within the per_sample_outs/ directory are described here.
Refer to the count and vdj pages for a detailed explanation.
The count/ folder contains the results of 5' Single Cell Gene Expression analysis:
├── count
    ├── analysis
    │   ├── clustering
    │   ├── diffexp
    │   ├── pca
    │   ├── tsne
    │   └── umap
    ├── aggregate_barcodes.csv
    ├── feature_reference.csv
    ├── sample_cloupe.cloupe
    ├── sample_filtered_barcodes.csv
    ├── sample_filtered_feature_bc_matrix
    │   ├── barcodes.tsv.gz
    │   ├── features.tsv.gz
    │   └── matrix.mtx.gz
    ├── sample_filtered_feature_bc_matrix.h5
    ├── sample_molecule_info.h5
    ├── sample_alignments.bam
    └── sample_alignments.bam.bai
File/Folder | Description |
---|---|
analysis | Folder containing the results of graph-based clustering and K-means clustering (K = 2 to 10), differential gene expression analysis between clusters, and PCA, t-SNE, and UMAP dimensionality reduction. |
aggregate_barcodes.csv | Contents from both the antibody and antigen aggregate barcode algorithms. If Antibody and Antigen Capture libraries are included and a specific barcode has been determined to be both an antigen and an antibody aggregate, this file contains two lines for that barcode: the first line is the antibody UMI count and the second line is the antigen UMI count associated with that aggregate barcode. The library_type column distinguishes antibody from antigen aggregate barcodes. |
feature_reference.csv | A copy of the input feature_reference.csv. |
sample_cloupe.cloupe | A Loupe Browser readable file. |
sample_filtered_barcodes.csv | File containing a list of barcodes associated with aligned reads. Each barcode sequence ends in a suffix consisting of a dash separator followed by a number. The number denotes the GEM well and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM chip channel runs. The number should be “1” for all barcodes when analyzing a sample from a single GEM well. Preserving GEM well information in the suffix is especially useful when running cellranger aggr on multiple libraries generated from different GEM chip channels. |
sample_filtered_feature_bc_matrix | Contains only detected cell-associated barcodes. Each element of the matrix is the number of UMIs associated with a feature (row) and a barcode (column). This matrix can be input into third-party packages, allowing users to wrangle the barcode-feature matrix (e.g., to filter outlier cells, run dimensionality reduction, or normalize gene expression). It is similar to the filtered_feature_bc_matrix output described on the count page. |
sample_filtered_feature_bc_matrix.h5 | Same information as sample_filtered_feature_bc_matrix in H5 format. |
sample_molecule_info.h5 | Contains per-molecule information for all molecules that contain a valid barcode and a valid UMI and were assigned with high confidence to a gene or Feature Barcode. This file is a required input for cellranger aggr. |
sample_alignments.bam | Indexed BAM file containing position-sorted reads aligned to the genome and transcriptome, as well as unaligned reads. |
sample_alignments.bam.bai | Companion file to sample_alignments.bam that serves as an external index. In cases where the reference transcriptome is generated from a genome with very long chromosomes (>512 Mbp), Cell Ranger v7.0+ generates a sample_alignments.bam.csi index file instead. |
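
The matrix folder described in the table can be read without third-party packages. The following is a minimal sketch, assuming that matrix.mtx.gz is a Matrix Market coordinate file with 1-based (feature row, barcode column, UMI count) triplets and that barcodes.tsv.gz holds one barcode per line. The function names `umi_totals_per_barcode` and `split_barcode` are hypothetical helpers introduced for illustration and are not part of Cell Ranger.

```python
import gzip
from pathlib import Path


def read_lines(path):
    """Return the lines of a gzip-compressed text file without newlines."""
    with gzip.open(path, "rt") as fh:
        return [line.rstrip("\n") for line in fh]


def umi_totals_per_barcode(matrix_dir):
    """Sum UMI counts over all features for each barcode (matrix column).

    Assumes matrix_dir contains barcodes.tsv.gz and matrix.mtx.gz.
    """
    matrix_dir = Path(matrix_dir)
    barcodes = read_lines(matrix_dir / "barcodes.tsv.gz")
    totals = dict.fromkeys(barcodes, 0)
    with gzip.open(matrix_dir / "matrix.mtx.gz", "rt") as fh:
        # Header and comment lines in Matrix Market files start with '%'.
        lines = (ln for ln in fh if not ln.startswith("%"))
        # First non-comment line: number of rows, columns, stored entries.
        _n_features, n_barcodes, _n_entries = map(int, next(lines).split())
        if n_barcodes != len(barcodes):
            raise ValueError("barcodes.tsv.gz does not match matrix columns")
        for ln in lines:
            if not ln.strip():
                continue
            _row, col, value = ln.split()
            totals[barcodes[int(col) - 1]] += int(value)  # indices are 1-based
    return totals


def split_barcode(barcode):
    """Split 'SEQUENCE-N' into the barcode sequence and the GEM well number."""
    sequence, _, gem_well = barcode.rpartition("-")
    return sequence, int(gem_well)
```

Calling `umi_totals_per_barcode("sample_filtered_feature_bc_matrix")` would return a dictionary keyed by barcode, and `split_barcode` recovers the GEM well suffix described for sample_filtered_barcodes.csv. For real analyses, a dedicated sparse-matrix reader is likely a better fit than this pure-Python loop.
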
TCR with gamma-delta chains
The cellranger multi pipeline allows users to analyze TCR libraries enriched for gamma (TRG) and delta (TRD) chains. However, gamma-delta analysis is not a supported workflow and algorithm performance cannot be guaranteed. TRG/D outputs are located in the |
The vdj_t/ and vdj_b/ folders contain the results of V(D)J immune profiling analysis for T cells and B cells, respectively. The output file names and file structure in these folders are identical and are described only once:
├── vdj_b/t
    ├── airr_rearrangement.tsv
    ├── cell_barcodes.json
    ├── clonotypes.csv
    ├── concat_ref.bam
    ├── concat_ref.bam.bai
    ├── concat_ref.fasta
    ├── concat_ref.fasta.fai
    ├── consensus_annotations.csv
    ├── consensus.bam
    ├── consensus.bam.bai
    ├── consensus.fasta
    ├── consensus.fasta.fai
    ├── filtered_contig_annotations.csv
    ├── filtered_contig.fasta
    ├── filtered_contig.fastq
    ├── vdj_contig_info.pb
    └── vloupe.vloupe
File/Folder | Description |
---|---|
airr_rearrangement.tsv | Annotated contigs and consensus sequences of V(D)J rearrangements in the AIRR format. |
cell_barcodes.json | List of barcodes identified as T/B cells. |
clonotypes.csv | High-level descriptions of each clonotype. |
concat_ref.bam | The reference sequence for each clonotype consensus is the concatenation of its annotated germline segments. This file shows how both the per-cell contigs and the clonotype consensus contig relate to the germline reference. concat_ref.bam is expected to reveal polymorphisms, somatic mutations, and recombination-induced differences such as non-templated nucleotide additions. |
concat_ref.bam.bai | Companion file to concat_ref.bam that serves as an external index. |
concat_ref.fasta | Concatenated V(D)J reference segments for the segments detected on each consensus sequence. These serve as an approximate reference for each consensus sequence. |
concat_ref.fasta.fai | Companion file to concat_ref.fasta that serves as an external index. |
consensus_annotations.csv | High-level and detailed annotations of each clonotype consensus sequence. |
consensus.bam | Each reference sequence is a clonotype consensus sequence, and each record is an alignment of a single cell's contig against this consensus. For a clonotype consensus sequence, this file shows how the constituent per-cell assemblies support the consensus. |
consensus.bam.bai | Companion file to consensus.bam that serves as an external index. |
consensus.fasta | Clonotype consensus sequences, i.e., the consensus sequence of each assembled contig. Each is identical to the sequence of the top (most frequent) exact subclonotype. The consensus sequence should be full-length (starting in the 5' UTR and ending at the C gene primer binding site); poor data quality may result in a partial sequence. |
consensus.fasta.fai | Companion file to consensus.fasta that serves as an external index. |
filtered_contig_annotations.csv | High-level annotations of each high-confidence, cellular contig. This is a subset of all_contig_annotations.csv. |
filtered_contig.fasta | High-confidence contig sequences in cell barcodes in FASTA format. |
filtered_contig.fastq | High-confidence contig sequences in cell barcodes in FASTQ format. |
vdj_contig_info.pb | Stores the contig annotations, V(D)J reference, and additional metadata in a protobuf binary file format. This file is required to run the cellranger aggr pipeline. |
vloupe.vloupe | Loupe V(D)J Browser readable file. |
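
To illustrate how a .fai companion file acts as an external index, the sketch below fetches a single sequence from a FASTA file by seeking directly to its byte offset. It assumes the .fai files follow the common samtools faidx layout of five tab-separated columns (name, length, offset, bases per line, bytes per line); this layout is an assumption and is not stated on this page. The function names are hypothetical.

```python
def read_fai(fai_path):
    """Parse a faidx-style index into {name: (length, offset, linebases, linewidth)}."""
    index = {}
    with open(fai_path) as fh:
        for line in fh:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # skip blank or malformed lines
            name, length, offset, linebases, linewidth = fields[:5]
            index[name] = (int(length), int(offset), int(linebases), int(linewidth))
    return index


def fetch_sequence(fasta_path, index, name):
    """Return the full sequence of one record using the index for random access."""
    length, offset, linebases, linewidth = index[name]
    if length == 0:
        return ""
    # Each complete line before the last one contributes its newline bytes.
    newline_bytes = linewidth - linebases
    n_bytes = length + ((length - 1) // linebases) * newline_bytes
    with open(fasta_path, "rb") as fh:
        fh.seek(offset)
        raw = fh.read(n_bytes)
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode("ascii")
```

With this approach, `fetch_sequence("consensus.fasta", read_fai("consensus.fasta.fai"), name)` reads only the bytes belonging to the requested record instead of scanning the whole file, which is the purpose of keeping the index alongside the FASTA.
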
Folder containing the results of Antigen Capture analysis. Only present if an Antigen Capture library is included in the analysis. The two files in this folder are antigen_specificity_scores.csv (present only if the [antigen-specificity] section was provided in the multi config CSV) and per_barcode.csv.
The primary outputs of the antigen specificity algorithm are located in antigen_specificity_scores.csv. The barcode column lists all cell-associated barcodes, the antigen and antigen_umi columns show the on-target antigen IDs and per-barcode on-target antigen UMI counts, and the control and control_umi columns show the negative control antigen IDs and negative control antigen UMI counts. The antigen specificity score is calculated per barcode (as described on the Antigen Algorithm page) and reported in the antigen_specificity_score column. For a TCR Antigen Capture (BEAM-T) library, the MHC allele ID is shown in the mhc_allele column. If a given barcode is associated with a clonotype, the clonotype and exact subclonotype IDs are reported in the raw_clonotype_id and exact_subclonotype_id columns, respectively.
barcode,antigen,antigen_umi,control,control_umi,antigen_specificity_score,mhc_allele,raw_clonotype_id,exact_subclonotype_id
AAACGGGAGCCCGAAA-1,BEAM01,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM02,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM03,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
AAACGGGAGCCCGAAA-1,BEAM04,0,BEAM12,2,0.0,HLA-A*02:01,clonotype1,1
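
Because the file holds one row per barcode and antigen pair, a common first step is to collapse it to one antigen per barcode. The following is a minimal sketch, assuming the column names shown in the sample above; the helper name `best_antigen_per_barcode` is hypothetical, and keeping the first row on ties is an arbitrary choice made for illustration, not a Cell Ranger convention.

```python
import csv


def best_antigen_per_barcode(handle):
    """Return {barcode: (antigen, score)} with the highest-scoring antigen.

    `handle` is any open text file or iterable of CSV lines. On ties, the
    first row encountered for a barcode is kept (arbitrary choice).
    """
    best = {}
    for row in csv.DictReader(handle):
        score = float(row["antigen_specificity_score"])
        current = best.get(row["barcode"])
        if current is None or score > current[1]:
            best[row["barcode"]] = (row["antigen"], score)
    return best
```

A typical call would open the file with `open("antigen_specificity_scores.csv", newline="")` and pass the handle to the function. Any cutoff applied to the resulting scores is left to the user, since this page does not state a recommended threshold.
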
The per_barcode.csv file is a barcode lookup table for finding barcodes that are called as Gene Expression or V(D)J cells. The is_gex_cell column identifies barcodes called as cells based on the Gene Expression library, the is_vdj_cell column identifies barcodes called as cells based on the V(D)J library, the raw_clonotype_id column shows the clonotype ID assigned to that barcode (if one exists), and the exact_subclonotype_id column shows the exact subclonotype ID assigned to that barcode (if one exists).
barcode,is_gex_cell,is_vdj_cell,raw_clonotype_id,exact_subclonotype_id
AAACCTGAGGTAGCCA-1,true,true,clonotype1,1
AAACCTGAGTGCTGCC-1,true,true,clonotype1,1
AAACCTGCAATCCGAT-1,true,true,clonotype1,1
AAACCTGGTACCGGCT-1,true,true,clonotype1,1
AAACCTGTCAGTTAGC-1,true,true,clonotype1,1
AAACCTGTCCGAAGAG-1,true,false,,
AAACGGGAGCTCAACT-1,true,true,clonotype1,1
AAACGGGCAGTAACGG-1,true,true,clonotype1,1
AAACGGGTCATAACCG-1,true,true,clonotype1,1
AAAGATGCAAGCGAGT-1,true,true,clonotype1,1
AAAGATGCACGGACAA-1,true,false,,
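
The lookup table lends itself to a quick tally of how the two cell calls overlap. The sketch below is one possible way to do this with the Python standard library, assuming the column names and the lowercase true/false values shown in the sample above; `summarize_cell_calls` is a hypothetical helper name.

```python
import csv
from collections import Counter


def summarize_cell_calls(handle):
    """Count barcodes by Gene Expression / V(D)J cell-call combination.

    `handle` is any open text file or iterable of CSV lines.
    """
    counts = Counter()
    for row in csv.DictReader(handle):
        is_gex = row["is_gex_cell"].strip().lower() == "true"
        is_vdj = row["is_vdj_cell"].strip().lower() == "true"
        if is_gex and is_vdj:
            counts["gex_and_vdj"] += 1
        elif is_gex:
            counts["gex_only"] += 1
        elif is_vdj:
            counts["vdj_only"] += 1
        else:
            counts["neither"] += 1
        # An empty raw_clonotype_id means no clonotype was assigned.
        if row["raw_clonotype_id"]:
            counts["with_clonotype"] += 1
    return counts
```

Applied to the eleven sample rows above, this tally gives nine barcodes called as cells by both libraries and two called by Gene Expression only, matching the rows whose clonotype columns are empty.
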