scVDJ-Seq Pipeline (CellRanger)
scVDJ-Seq
V(D)J Library Construction
- V (Variable): These gene segments code for the variable region of an antibody or T-cell receptor. The variable region is responsible for binding to antigens.
- D (Diversity): These segments are present only in some receptor chains (the immunoglobulin heavy chain and the T-cell receptor β and δ chains). They add a further level of diversity to the antigen-binding region.
- J (Joining): These gene segments join with the V (and D, where present) segments to complete the variable region of the receptor.
- C (Constant): These segments encode the constant region of the antibody or T-cell receptor. This region varies little between different antibodies and is responsible for the effector functions of the antibody, such as recruiting other parts of the immune system.
Pipeline
© 10X Genomics
Install CellRanger
Follow the link and fill out the form to reach the download page.
Reference
Run
Documentation: 10X Genomics
Or De-novo
Command and arguments:
- `cellranger vdj`: The main command being run. `cellranger` is the software package, and `vdj` selects the V(D)J analysis pipeline, which assembles and annotates V(D)J sequences from single-cell RNA-Seq data.
- `--id=sample345`: Sets the unique identifier for the run, here `sample345`. This ID is used to name the output directory.
- `--reference=...`: Specifies the reference dataset to use for the analysis. The provided path (`/opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.1.0`) points to a reference dataset for human V(D)J sequences.
- `--fastqs=...`: Gives the directory where the FASTQ files are located. FASTQ files are the input to Cell Ranger and contain the sequenced reads.
- `--sample=mysample`: Specifies the name of the sample to analyze. It should match the sample name in the FASTQ file names.
- `--localcores=8`: Tells Cell Ranger to use 8 CPU cores for the computation.
- `--localmem=64`: Allocates 64 GB of memory (RAM) for the run, so that the software has enough memory to process the data without crashing.
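The arguments described above can be assembled into a single command line. The sketch below does this in Python so that each piece stays easy to edit. The FASTQ directory `/path/to/fastqs` is a hypothetical placeholder (the original path is elided), and the helper name `build_vdj_command` is illustrative, not part of Cell Ranger.

```python
import shlex


def build_vdj_command(run_id, reference, fastqs, sample, cores=8, mem_gb=64):
    """Return the argument list for a `cellranger vdj` run."""
    return [
        "cellranger", "vdj",
        f"--id={run_id}",            # names the output directory
        f"--reference={reference}",  # V(D)J reference dataset
        f"--fastqs={fastqs}",        # directory holding the FASTQ files
        f"--sample={sample}",        # sample name prefix of the FASTQ files
        f"--localcores={cores}",     # CPU cores to use
        f"--localmem={mem_gb}",      # memory limit in GB
    ]


cmd = build_vdj_command(
    run_id="sample345",
    reference="/opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.1.0",
    fastqs="/path/to/fastqs",  # hypothetical placeholder, replace with your own
    sample="mysample",
)

# Print the command line so it can be pasted into a shell.
print(shlex.join(cmd))

# To launch it from Python instead (requires cellranger on the PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids quoting problems if a path ever contains spaces.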
When the Job is Done
A successful `cellranger vdj` run should conclude with a message similar to this:

Outputs:
- Run summary HTML: /home/jdoe/runs/sample345/outs/web_summary.html
- Run summary CSV: /home/jdoe/runs/sample345/outs/metrics_summary.csv
- Clonotype info: /home/jdoe/runs/sample345/outs/clonotypes.csv
- Filtered contig sequences FASTA: /home/jdoe/runs/sample345/outs/filtered_contig.fasta
- Filtered contig sequences FASTQ: /home/jdoe/runs/sample345/outs/filtered_contig.fastq
- Filtered contigs (CSV): /home/jdoe/runs/sample345/outs/filtered_contig_annotations.csv
- All-contig FASTA: /home/jdoe/runs/sample345/outs/all_contig.fasta
- All-contig FASTA index: /home/jdoe/runs/sample345/outs/all_contig.fasta.fai
- All-contig FASTQ: /home/jdoe/runs/sample345/outs/all_contig.fastq
- Read-contig alignments: /home/jdoe/runs/sample345/outs/all_contig.bam
- Read-contig alignment index: /home/jdoe/runs/sample345/outs/all_contig.bam.bai
- All contig annotations (JSON): /home/jdoe/runs/sample345/outs/all_contig_annotations.json
- All contig annotations (BED): /home/jdoe/runs/sample345/outs/all_contig_annotations.bed
- All contig annotations (CSV): /home/jdoe/runs/sample345/outs/all_contig_annotations.csv
- Barcodes that are declared to be targetted cells: /home/jdoe/runs/sample345/outs/cell_barcodes.json
- Clonotype consensus FASTA: /home/jdoe/runs/sample345/outs/consensus.fasta
- Clonotype consensus FASTA index: /home/jdoe/runs/sample345/outs/consensus.fasta.fai
- Contig-consensus alignments: /home/jdoe/runs/sample345/outs/consensus.bam
- Contig-consensus alignment index: /home/jdoe/runs/sample345/outs/consensus.bam.bai
- Clonotype consensus annotations (CSV): /home/jdoe/runs/sample345/outs/consensus_annotations.csv
- Concatenated reference sequences: /home/jdoe/runs/sample345/outs/concat_ref.fasta
- Concatenated reference index: /home/jdoe/runs/sample345/outs/concat_ref.fasta.fai
- Contig-reference alignments: /home/jdoe/runs/sample345/outs/concat_ref.bam
- Contig-reference alignment index: /home/jdoe/runs/sample345/outs/concat_ref.bam.bai
- Loupe V(D)J Browser file: /home/jdoe/runs/sample345/outs/vloupe.vloupe
- V(D)J reference:
    fasta:
      regions: /home/jdoe/runs/sample345/outs/vdj_reference/fasta/regions.fa
      donor_regions: /home/jdoe/runs/sample345/outs/vdj_reference/fasta/donor_regions.fa
    reference: /home/jdoe/runs/sample345/outs/vdj_reference/reference.json
- AIRR Rearrangement TSV: /home/jdoe/runs/sample345/outs/airr_rearrangement.tsv
- All contig info (ProtoBuf format): /home/jdoe/runs/sample345/outs/vdj_contig_info.pb

Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
Once `cellranger vdj` has successfully completed, you can browse the resulting summary HTML file in any supported web browser, open the `.vloupe` file in Loupe V(D)J Browser, or refer to the Understanding Output section to explore the data by hand.
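As one way to explore the data by hand, the sketch below lists the most frequent clonotypes from `clonotypes.csv`. It assumes the file has the columns `clonotype_id`, `frequency` and `cdr3s_aa`; check the header of your own file first, since column names may differ between Cell Ranger versions. The function name `top_clonotypes` is illustrative.

```python
import csv
import os


def top_clonotypes(lines, n=5):
    """Return (clonotype_id, frequency, cdr3s_aa) for the n most frequent clonotypes.

    `lines` is any iterable of CSV text lines, e.g. an open file handle.
    """
    rows = list(csv.DictReader(lines))
    # frequency is the number of cell barcodes assigned to the clonotype
    rows.sort(key=lambda row: int(row["frequency"]), reverse=True)
    return [
        (row["clonotype_id"], int(row["frequency"]), row["cdr3s_aa"])
        for row in rows[:n]
    ]


# Path taken from the example run output; adjust to your own run directory.
path = "/home/jdoe/runs/sample345/outs/clonotypes.csv"
if os.path.exists(path):
    with open(path, newline="") as handle:
        for clonotype_id, frequency, cdr3s in top_clonotypes(handle):
            print(f"{clonotype_id}\t{frequency}\t{cdr3s}")
```

Only the standard library is used here; for larger analyses a data-frame library would be more convenient.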
Troubleshooting
Too Low to Meet the Required Threshold
[error] Pipestance failed. Error log at:
MockC_cs/SC_VDJ_ASSEMBLER_CS/SC_MULTI_CORE/MULTI_CHEMISTRY_DETECTOR/VDJ_CHEMISTRY_DETECTOR/DETECT_VDJ_RECEPTOR/fork0/chnk0-u22ea849f77/_errors

Log message:
V(D)J Chain detection failed for Sample VDJ-B-293-redo-1 in "/raid/home/wenkanl2/MokC_sc/1_primary_seq".

Total Reads = 1000000
Reads mapped to TR = 30
Reads mapped to IG = 28665

In order to distinguish between the TR and the IG chain the following conditions need to be satisfied:
- A minimum of 10000 total reads
- A minimum of 5.0% of the total reads needs to map to TR or IG
- The number of reads mapped to TR should be at least 3.0x compared to the number of reads mapped to IG or vice versa

Please check the input data and/or specify the chain via the --chain argument.
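The problem here is the proportion of reads mapping to TR or IG. Only 28,695 of the 1,000,000 reads (about 2.9%) map to either chain, which is below the 5.0% minimum. The other two conditions are met: there are more than 10,000 total reads, and the reads mapped to IG outnumber those mapped to TR by far more than 3-fold.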
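The three conditions quoted in the log message can be checked by hand against the reported read counts. The sketch below re-implements them as they are worded in the message; it is not Cell Ranger's internal code, and the use of `>=` for each comparison is an assumption.

```python
MIN_TOTAL_READS = 10_000      # "A minimum of 10000 total reads"
MIN_MAPPED_FRACTION = 0.05    # "A minimum of 5.0% ... needs to map to TR or IG"
MIN_FOLD = 3.0                # "at least 3.0x ... or vice versa"


def check_chain_detection(total, tr, ig):
    """Evaluate the three chain-detection conditions from the error message."""
    mapped_fraction = (tr + ig) / total
    return {
        "enough_total_reads": total >= MIN_TOTAL_READS,
        "enough_mapped_reads": mapped_fraction >= MIN_MAPPED_FRACTION,
        "clear_majority_chain": tr >= MIN_FOLD * ig or ig >= MIN_FOLD * tr,
    }


# Read counts copied from the log message above.
checks = check_chain_detection(total=1_000_000, tr=30, ig=28_665)
for name, passed in checks.items():
    print(f"{name}: {'pass' if passed else 'FAIL'}")

# (30 + 28665) / 1000000 = 0.028695
print(f"mapped fraction: {(30 + 28_665) / 1_000_000:.2%}")  # mapped fraction: 2.87%
```

Only `enough_mapped_reads` fails for these counts, which matches the 5.0% condition in the message.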
Resolution:
The message suggests checking the input data or specifying the chain via the `--chain` argument. To bypass automatic detection, state explicitly whether you are analyzing T-cell receptors (TR) or immunoglobulins (IG) by passing the `--chain` flag to `cellranger vdj`.
For example, if this is B cell data, adding `--chain IG` to the command resolves the error.
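A rerun of the failed sample with the chain set explicitly might look like the sketch below. The sample name and FASTQ directory are copied from the error message, and the run ID `MockC_cs` is inferred from the error log path. The reference path is an assumption, reused from the example earlier in this post.

```python
import shlex

# Values taken from the error message; the reference path is an assumption.
cmd = [
    "cellranger", "vdj",
    "--id=MockC_cs",
    "--reference=/opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.1.0",
    "--fastqs=/raid/home/wenkanl2/MokC_sc/1_primary_seq",
    "--sample=VDJ-B-293-redo-1",
    "--chain", "IG",  # skip auto-detection and treat the library as B cell (IG)
]

# Print the command line so it can be pasted into a shell.
print(shlex.join(cmd))
```

If the previous failed run left a `MockC_cs` directory behind, it may need to be removed or a new `--id` chosen before rerunning.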