|
91 | 91 | "\n", |
92 | 92 | "### Immune receptor sequencing\n", |
93 | 93 | "\n", |
94 | | - "A common approach to discern V(D)J chains from single-cell isolations consist on computational reconstructions of different chains sequences based on full-length single-cell RNA sequencing, being Smart-seq2, a 5'-end RNA template based protocol, one of the widest implemented. Regarding computational methods, TRAPeS, TraCer, and VDJPuzzle are usually used to reconstruct TCR sequences based on scRNA-seq data, whereas BALDR {cite}`Upadhyay2018`, BASIC {cite}`canzar2016` and BraCer {cite}`Lindeman2018` were shown to robustly recover BCR sequences. However, they are prone to ignore the whole landscape of recombinatorial products and alternative splicing products in V(D)J region. Some alternatives have rise to deal with this problematic, RAGE-seq for example was developed to capture specific TCR and BCR fragments based on PCR templates designed for immune receptor sequencing and use long-read Oxford Nanopore to capture the whole sequence, whereas the rest of the cDNA is processed based on short-reads protocols provided by, for example, Illumina {cite}`singh2019high`. " |
| 94 | + "A common approach to discern V(D)J chains from single-cell isolations consists of computationally reconstructing the different chains' sequences from full-length single-cell RNA sequencing data, with Smart-seq2, a 5'-end RNA template based protocol, being one of the most widely implemented. Regarding computational methods, TRAPeS, TraCeR, and VDJPuzzle are usually used to reconstruct TCR sequences based on scRNA-seq data, whereas BALDR {cite}`Upadhyay2018`, BASIC {cite}`canzar2016` and BraCeR {cite}`Lindeman2018` were shown to robustly recover BCR sequences. However, they are prone to ignore the whole landscape of recombinatorial products and alternative splicing products in the V(D)J region. Some alternatives have arisen to deal with this problem. RAGE-seq, for example, was developed to capture specific TCR and BCR fragments based on PCR templates designed for immune receptor sequencing and uses long-read Oxford Nanopore sequencing to capture the whole sequence, whereas the rest of the cDNA is processed with short-read protocols provided by, for example, Illumina {cite}`singh2019high`. " |
95 | 95 | ] |
96 | 96 | }, |
97 | 97 | { |
|
102 | 102 | "## AIR repertoire analysis\n", |
103 | 103 | "\n", |
104 | 104 | "VDJ-sequencing provides us with the nucleotide and thereby also the protein sequence of the AIR paired for both chains, from which the V-, (D-,) J-, and C-genes are determined in addition to the CDR3 sequence. Overall, the AIR sequence determines the specificity of the individual B- and T-cell. Therefore, the information obtained by VDJ-sequencing provides us with an indicator of the cells' functionality, which is directly coupled to the AIR's target antigen. This enables us to use the AIR information in three major ways:\n", |
105 | | - "- **Phenotyping**: We can group immune cells by identifying cells with the same or similar AIR, which share the same specificity. Having these groups, we can now observe, how disease-specific cells react under different conditions (e.g. transcriptomic change upon stimulation), whether immune cells have proliferated, or how the diversity of an immune repertoire changes upon after an immune response.\n", |
| 105 | + "- **Phenotyping**: We can group immune cells by identifying cells with the same or similar AIR, which share the same specificity. Having these groups, we can now observe how disease-specific cells react under different conditions (e.g. transcriptomic change upon stimulation), whether immune cells have proliferated, or how the diversity of an immune repertoire changes after an immune response.\n", |
106 | 106 | "- **Sequence Analysis**: Having identified groups of AIRs (e.g. a reactive cluster detected in other modalities), we can extract properties of their sequence, such as V-, D-, and J-gene usage or enriched sequence motifs, that are related to specific diseases or therapies.\n", |
107 | 107 | "- **Specificity-Inference**: Last, we can use the sequence to match AIRs to their target antigen via database queries, sequence distances, or predictors. This directly identifies cells reactive to specific infectious diseases, tumors, or self-antigens. \n" |
108 | 108 | ] |
|
260 | 260 | "metadata": {}, |
261 | 261 | "source": [ |
262 | 262 | "### Raw data\n", |
263 | | - "We begin by with viewing the raw output of the cell ranger pipeline for a better understanding of the data we are working with.\n", |
| 263 | + "We begin by viewing the raw output of the Cell Ranger pipeline for a better understanding of the data we are working with.\n", |
264 | 264 | "We will load the `filtered_contig_annotations.csv` file to view its content. Each row will represent one measurement of a sequence." |
265 | 265 | ] |
266 | 266 | }, |
|
3033 | 3033 | "text": [ |
3034 | 3034 | "Amount of all B cells:\t\t\t\t159446\n", |
3035 | 3035 | "Amount of B cells with AIR:\t\t\t159185\n", |
3036 | | - "Amount of B cells without dublets:\t\t159185\n", |
| 3036 | + "Amount of B cells without doublets:\t\t159185\n", |
3037 | 3037 | "Amount of B cells with unique AIR per cell:\t153936\n", |
3038 | | - "Amount of B cells with sinlge complete AIR:\t108395\n" |
| 3038 | + "Amount of B cells with single complete AIR:\t108395\n" |
3039 | 3039 | ] |
3040 | 3040 | } |
3041 | 3041 | ], |
|
3045 | 3045 | "print(f\"Amount of B cells with AIR:\\t\\t\\t{len(adata_bcr_tmp)}\")\n", |
3046 | 3046 | "\n", |
3047 | 3047 | "adata_bcr_tmp = adata_bcr_tmp[adata_bcr_tmp.obs[\"chain_pairing\"] != \"multi_chain\"]\n", |
3048 | | - "print(f\"Amount of B cells without dublets:\\t\\t{len(adata_bcr_tmp)}\")\n", |
| 3048 | + "print(f\"Amount of B cells without doublets:\\t\\t{len(adata_bcr_tmp)}\")\n", |
3049 | 3049 | "\n", |
3050 | 3050 | "adata_bcr_tmp = adata_bcr_tmp[\n", |
3051 | 3051 | " ~adata_bcr_tmp.obs[\"chain_pairing\"].isin(\n", |
|
3136 | 3136 | "text": [ |
3137 | 3137 | "Amount of all T cells:\t\t\t\t280045\n", |
3138 | 3138 | "Amount of T cells with AIR:\t\t\t280023\n", |
3139 | | - "Amount of T cells without dublets:\t\t280023\n", |
| 3139 | + "Amount of T cells without doublets:\t\t280023\n", |
3140 | 3140 | "Amount of T cells with unique AIR per cell:\t250160\n", |
3141 | 3141 | "Amount of T cells with single complete AIR:\t196957\n" |
3142 | 3142 | ] |
|
3148 | 3148 | "print(f\"Amount of T cells with AIR:\\t\\t\\t{len(adata_tcr_tmp)}\")\n", |
3149 | 3149 | "\n", |
3150 | 3150 | "adata_tcr_tmp = adata_tcr_tmp[adata_tcr_tmp.obs[\"chain_pairing\"] != \"multi_chain\"]\n", |
3151 | | - "print(f\"Amount of T cells without dublets:\\t\\t{len(adata_tcr_tmp)}\")\n", |
| 3151 | + "print(f\"Amount of T cells without doublets:\\t\\t{len(adata_tcr_tmp)}\")\n", |
3152 | 3152 | "\n", |
3153 | 3153 | "adata_tcr_tmp = adata_tcr_tmp[\n", |
3154 | 3154 | " ~adata_tcr_tmp.obs[\"chain_pairing\"].isin(\n", |
|