
10x Genomics
Chromium Genome & Exome

Generating FASTQs with bcl2fastq

Though longranger demux is the preferred option for converting BCLs to Long Ranger-compatible FASTQs, there are a few cases where you may need to use Illumina's bcl2fastq directly to generate FASTQs.

You may also choose this method if bcl2fastq is more tightly integrated into your sequencing workflow.

Demultiplexing Chromium data with Illumina bcl2fastq requires a correctly specified sample sheet and the correct command-line options. This guide walks through the steps needed to generate Long Ranger-compatible FASTQs.

Sample Sheet Generator

You will need to create a sample sheet in order to get bcl2fastq to correctly embed the names of samples into output FASTQ files. There is a key difference to keep in mind when creating sample sheets for a Chromium run. Each Chromium sample index set is actually a blend of 4 different sequence oligos, and each oligo must be represented as a separate row in the sample sheet. This means that for every sample being demultiplexed from the flowcell, there should be 4 lines in the sample sheet.
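As a rough illustration of the four-rows-per-sample rule, the shell sketch below expands one sample into four data lines. The function name expand_sample, the column order (Lane, Sample_ID, Sample_Name, index, Sample_Project), the numbered Sample_ID suffix, and the oligo sequences are all placeholder assumptions rather than values from a real sample index set. Substitute the four oligo sequences that belong to your own index set.

```shell
# Sketch: print four sample sheet data lines for one sample, one per oligo.
# Assumed column order: Lane,Sample_ID,Sample_Name,index,Sample_Project
# Usage: expand_sample LANE SAMPLE_NAME PROJECT OLIGO1 OLIGO2 OLIGO3 OLIGO4
expand_sample() {
  lane=$1
  name=$2
  project=$3
  shift 3
  i=1
  for oligo in "$@"; do
    # Sample_ID gets a numeric suffix (an assumption) so each row is unique.
    printf '%s,%s_%s,%s,%s,%s\n' "$lane" "$name" "$i" "$name" "$oligo" "$project"
    i=$((i + 1))
  done
}

# Placeholder oligos, NOT a real Chromium sample index set.
expand_sample 1 mysample myproject AAAAAAAA CCCCCCCC GGGGGGGG TTTTTTTT
```

Every row shares the same lane, Sample_Name, and Sample_Project, and differs only in the index column and the Sample_ID suffix.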

The tool below will help you accurately generate data lines for your sample sheet. When you plan an experiment, you should know the name of the sample index set used for each sample, which comes from the reagent kit (such as "SI-P01-A2"). For each sample, enter its lane, sample name, and sample index set below, and then press 'Add'. When you're done, you can either copy and paste comma-separated output directly into a text editor to create a sample sheet CSV, or copy/paste tab-separated output into a spreadsheet such as Microsoft Excel:


If you are running only a single sample in a lane, the sample sheet can contain a single line for that lane with the index left blank. Note that in this case bcl2fastq will include reads associated with any sample index.

Running bcl2fastq

Illumina bcl2fastq must be called with the correct --use-bases-mask argument and other arguments in order to properly demultiplex and output FASTQs for all the reads in a Chromium library. In the examples below, ${FLOWCELL_DIR} is the directory that contains a flowcell's Data folder, ${OUTPUT_DIR} is the directory that you want to output FASTQs to, and ${SAMPLE_SHEET_PATH} is the path to the sample sheet CSV you created. ${INTEROP_DIR}, which appears only in the first example, is the directory passed to --interop-dir, where bcl2fastq writes its demultiplexing statistics.

bcl2fastq Version 2.17 or higher

This is the most common case, for sequencers running RTA 1.18.54 and higher.

$ bcl2fastq --use-bases-mask=Y150,I8,Y150 \
  --create-fastq-for-index-reads \
  --minimum-trimmed-read-length=8 \
  --mask-short-adapter-reads=8 \
  --ignore-missing-positions \
  --ignore-missing-controls \
  --ignore-missing-filter \
  --ignore-missing-bcls \
  -r 6 -w 6 \
  -R ${FLOWCELL_DIR} \
  --output-dir=${OUTPUT_DIR} \
  --interop-dir=${INTEROP_DIR} \
  --sample-sheet=${SAMPLE_SHEET_PATH}

bcl2fastq Version 1.8.4

$ configureBclToFastq.pl --use-bases-mask=Y150,I8,Y150 \
  --fastq-cluster-count=20000000 \
  --no-eamss \
  --ignore-missing-bcl \
  --ignore-missing-control \
  --ignore-missing-stats \
  --mismatches=1 \
  --input-dir=${FLOWCELL_DIR}/Data/Intensities/BaseCalls \
  --output-dir=${OUTPUT_DIR}

In both cases, if you want to limit bcl2fastq to a subset of lanes, you will need to supply values to the --tiles argument.
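As a hedged sketch of what such a value could look like, the helper below turns a list of lane numbers into a comma-separated list of lane patterns. It assumes that --tiles accepts patterns of the form s_<lane>, each matching every tile in that lane. The function name lanes_to_tiles is hypothetical, and the exact pattern syntax should be checked against the Illumina documentation for your bcl2fastq version.

```shell
# Sketch: build a --tiles value that selects whole lanes.
# Assumes each pattern "s_<lane>" matches all tiles in that lane.
# Usage: lanes_to_tiles LANE [LANE ...]
lanes_to_tiles() {
  out=""
  for lane in "$@"; do
    # Append ",s_<lane>", omitting the comma before the first entry.
    out="${out:+$out,}s_${lane}"
  done
  printf '%s\n' "$out"
}

# Example: restrict demultiplexing to lanes 1 and 2.
echo "--tiles=$(lanes_to_tiles 1 2)"
```

The example prints --tiles=s_1,s_2, which could then be appended to either of the commands above.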

Omitting Extra Bases from Reads

If you add extra bases to a barcode or sample index read, you will need to account for this in the --use-bases-mask argument. For example, if you ran a sample index read with 9 bases, you will need to truncate the last base in order for Long Ranger to run correctly.

You can exclude a single base by adding a single n character to the read argument, or adding n* to exclude all bases after a certain position. See below:

Read    Desired    Actual    Argument
I1      8          9         I8n
R1      98         114       Y98n*

For samples intended for Long Ranger, make sure that any extra bases are masked out of the sample index read, so that only the 8 bases of its I8 read spec are kept.
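The masking rule described in this section can be sketched as a small shell helper. The function name mask_segment and its argument order are hypothetical. It follows the convention described here: no suffix when the lengths match, a single n for one extra base, and n* for several.

```shell
# Sketch: build one segment of a --use-bases-mask value.
# Usage: mask_segment PREFIX DESIRED_LENGTH ACTUAL_LENGTH
#   PREFIX is Y for a sequence read or I for an index read.
mask_segment() {
  prefix=$1
  desired=$2
  actual=$3
  extra=$((actual - desired))
  if [ "$extra" -lt 0 ]; then
    echo "actual read length is shorter than desired" >&2
    return 1
  elif [ "$extra" -eq 0 ]; then
    printf '%s%s\n' "$prefix" "$desired"      # nothing to mask
  elif [ "$extra" -eq 1 ]; then
    printf '%s%sn\n' "$prefix" "$desired"     # mask exactly one base
  else
    printf '%s%sn*\n' "$prefix" "$desired"    # mask all remaining bases
  fi
}

# Example: 150-base reads with a 9-base sample index read.
mask="$(mask_segment Y 150 150),$(mask_segment I 8 9),$(mask_segment Y 150 150)"
echo "--use-bases-mask=$mask"
```

For this example the assembled argument is --use-bases-mask=Y150,I8n,Y150.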

Running Long Ranger with bcl2fastq FASTQs

After generating FASTQs, you should be able to follow the pipeline instructions, with one caveat. Instead of using the --indices argument to longranger run to select samples, you will use the --fastqprefix argument. The value of --fastqprefix should be the name of the sample, which should have been in the Sample_Name column in your sample sheet. The value of --fastqs should be ${OUTPUT_DIR}/${PROJECT_NAME} where ${OUTPUT_DIR} is as defined above and ${PROJECT_NAME} is the value in the Sample_Project column in the sample sheet.
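As a minimal sketch, the two argument values described above can be assembled as follows. The function name longranger_fastq_args and the example path, project, and sample names are hypothetical. The command is only printed, not run, and the other arguments that longranger run requires are omitted.

```shell
# Sketch: build the --fastqs and --fastqprefix arguments for longranger run.
# Usage: longranger_fastq_args OUTPUT_DIR PROJECT_NAME SAMPLE_NAME
longranger_fastq_args() {
  output_dir=$1
  project=$2   # value of the Sample_Project column
  sample=$3    # value of the Sample_Name column
  printf '%s%s/%s %s%s\n' '--fastqs=' "$output_dir" "$project" '--fastqprefix=' "$sample"
}

# Print (do not run) the relevant part of the command line.
echo longranger run $(longranger_fastq_args /data/fastqs myproject mysample)
```

Printing the command first makes it easy to confirm that the FASTQ directory and sample name match the sample sheet before launching the pipeline.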