Commit 41b1cb5

LuisHeinzlmeier and Luis authored
Add dropdowns to all chapter for key takeaways and environment setup (#333)
* env info for one chapter
* changes to preprocessing and visualization section
* undo random changes to execution_count
* add env info box to the top of all chapters
* Revert random changes to muon_to_seurat.ipynb
* default text improvements + some design options
* design options
* add boxes to air_repertoire
* boxes for conditions
* boxes for cellular-structure + new (unique) anchor logic
* updating new anchor logic to existing boxes
* first two boxes for chromatin-accessibility
* all boxes chromatin accessibility
* box for deconvolution
* boxes for mechanisms and multimodal integration
* boxes for trajectories
* boxes for preprocessing
* boxes for spatial
* boxes for surface protein
* boxes for introduction

---------

Co-authored-by: Luis <ge34lah@mytum.de>
1 parent 248eb74 commit 41b1cb5

43 files changed, +1792 -377 lines changed
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+1. **Install conda**:
+
+   - Before creating the environment, ensure that [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) is installed on your system.
+
+2. **Save the yml content**:
+
+   - Copy the content from the yml tab into a file named `environment.yml`.
+
+3. **Create the environment**:
+
+   - Open a terminal or command prompt.
+   - Run the following command:
+     ```bash
+     conda env create -f environment.yml
+     ```
+
+4. **Activate the environment**:
+
+   - After the environment is created, activate it using:
+     ```bash
+     conda activate <environment_name>
+     ```
+   - Replace `<environment_name>` with the name specified in the `environment.yml` file. In the yml file it will look like this:
+     ```yaml
+     name: <environment_name>
+     ```
+
+5. **Verify the installation**:
+   - Check that the environment was created successfully by running:
+     ```bash
+     conda env list
+     ```
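
Step 4 asks the reader to look up the environment name by hand. As a minimal sketch, the lookup could be scripted, assuming the file contains a top-level `name:` line whose value is unquoted and on the same line as the key. The helper name `env_name_from_yml` is hypothetical and not part of conda or of this commit.

```bash
# Hypothetical helper (not part of conda): print the value of the first
# top-level `name:` key in a conda environment file.
# Assumption: the value is unquoted and sits on the same line as the key.
env_name_from_yml() {
    # First expression trims trailing whitespace; the second strips the key
    # and prints only the lines where a top-level `name:` was found.
    sed -n -e 's/[[:space:]]*$//' -e 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Example usage in an interactive shell (commented out because it needs
# conda and an existing environment.yml):
# conda env create -f environment.yml
# conda activate "$(env_name_from_yml environment.yml)"
```

Only lines that start with `name:` in the first column are matched, so indented keys further down the file are ignored.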

jupyter-book/air_repertoire/clonotype.ipynb

Lines changed: 43 additions & 0 deletions
@@ -10,11 +10,53 @@
  "# Clonotype analysis"
  ]
  },
+ {
+ "cell_type": "markdown",
+ "id": "56750b9d",
+ "metadata": {},
+ "source": [
+ "```{dropdown} <i class=\"fas fa-brain\"></i>&nbsp;&nbsp;&nbsp;Key takeaways\n",
+ "\n",
+ ":::{card}\n",
+ ":link: air-repertoire-clonotype-key-takeaway-1\n",
+ ":link-type: ref\n",
+ "Clonal Expansion: Specific lymphocytes proliferate in response to immune stimuli, reducing clonal diversity but increasing the abundance of expanded clones.\n",
+ "This process is critical for understanding immune responses, as it reflects the transition from naive to effector and memory cells.\n",
+ ":::\n",
+ "\n",
+ ":::{card}\n",
+ ":link: air-repertoire-clonotype-key-takeaway-2\n",
+ ":link-type: ref\n",
+ "Gene Segment Usage & Spectratype: The rearrangement of V(D)J gene segments shows preferential usage, and spectratype analysis measures CDR3 length diversity.\n",
+ "Together, they help identify immunodominant clonotypes and characterize immune repertoire patterns.\n",
+ ":::\n",
+ "\n",
+ "```\n",
+ "\n",
+ "``````{dropdown} <i class=\"fa-solid fa-gear\"></i>&nbsp;&nbsp;&nbsp;Environment setup\n",
+ "`````{tab-set}\n",
+ " \n",
+ "````{tab-item} Steps\n",
+ "```{include} ../_static/default_text_env_setup.md\n",
+ "```\n",
+ "````\n",
+ "\n",
+ "````{tab-item} yml\n",
+ "```{literalinclude} compositional.yml\n",
+ ":language: yaml\n",
+ "```\n",
+ "````\n",
+ "\n",
+ "`````\n",
+ "``````"
+ ]
+ },
  {
  "cell_type": "markdown",
  "id": "b4976e28",
  "metadata": {},
  "source": [
+ "(air-repertoire-clonotype-key-takeaway-1)=\n",
  "## Clonal expansion: diversity and abundance\n",
  "\n",
  "In general, lymphocytes are in a dormant state until receiving an external signal (epitope recognition of foreign agent) or stimulation from autocrine agents (signaling from the same organism as a response from the innate immune system). As a consequence, the specific cells proliferate dramatically to fulfill the defense response they are programmed to perform in a process known as clonal expansion {cite}`polonsky2016clonal`. This refers to the recognition of the proliferation of specific cells given the high number of the same IR across many different cells (expanded clones). This expansion provides hints of differentiation from naive lymphocytes to mature effector and memory lymphocytes, helping in the interpretation and expected results regarding previous cell annotation {cite}`polonsky2016clonal`. On the other hand, the analysis of expanded clones should consider derivative processes such as clonal competition (two or more clones in expansion competing for the same space), clonal dominance (one single clonal expanded cell outnumbering the rest of the clonal cells), and bystander activation (activation of T-cells by cytokines rather than through T-cell receptor coupling) {cite}`naxerova2020clonal`{cite}`ashcroft2017clonal`{cite}`kim2019activation`.\n",
@@ -27,6 +69,7 @@
  "id": "3e3cd4d4",
  "metadata": {},
  "source": [
+ "(air-repertoire-clonotype-key-takeaway-2)=\n",
  "## Gene segment usage and spectratype\n",
  "\n",
  "The process shaping a T-cell or B-cell receptor by rearrangement of the V(D)J segments is thought to generate random sequences and, in consequence, the distribution of V(D)J sequences should follow a uniform distribution. Nevertheless, it has been observed that V(D)J gene usage frequency is largely consistent across different individuals, which suggests a preferential selection of the V(D)J gene segments used {cite}`elhanati2014quantifying`. That allows the analysis of gene segment usage in terms of abundance of most used gene segments per cell type and frequency of most abundant segment per cell type per individual {cite}`chernyshev2021vdj`. Likewise, considering we know the amino acid composition of the immune receptors for each cell, it is possible to identify the exact combinations of V(D)J segments of interest.\n",

jupyter-book/air_repertoire/ir_profiling.ipynb

Lines changed: 49 additions & 20 deletions
@@ -8,6 +8,51 @@
  "# Immune Receptor Profiling "
  ]
  },
+ {
+ "cell_type": "markdown",
+ "id": "f55a6006",
+ "metadata": {},
+ "source": [
+ "```{dropdown} <i class=\"fas fa-brain\"></i>&nbsp;&nbsp;&nbsp;Key takeaways\n",
+ "\n",
+ ":::{card}\n",
+ ":link: air-repertoire-ir-profiling-key-takeaway-1\n",
+ ":link-type: ref\n",
+ "VDJ-sequencing measures the sequence information of AIRs, which provides us insights into the function of B- and T-cells.\n",
+ ":::\n",
+ "\n",
+ ":::{card}\n",
+ ":link: air-repertoire-ir-profiling-key-takeaway-2\n",
+ ":link-type: ref\n",
+ "To detect doublets, cells with multiple AIRs should be identified or filtered.\n",
+ ":::\n",
+ "\n",
+ ":::{card}\n",
+ ":link: air-repertoire-ir-profiling-key-takeaway-3\n",
+ ":link-type: ref\n",
+ "Depending on the analysis, we additionally filter cells with incomplete AIR information.\n",
+ ":::\n",
+ "\n",
+ "```\n",
+ "\n",
+ "``````{dropdown} <i class=\"fa-solid fa-gear\"></i>&nbsp;&nbsp;&nbsp;Environment setup\n",
+ "`````{tab-set}\n",
+ " \n",
+ "````{tab-item} Steps\n",
+ "```{include} ../_static/default_text_env_setup.md\n",
+ "```\n",
+ "````\n",
+ "\n",
+ "````{tab-item} yml\n",
+ "```{literalinclude} compositional.yml\n",
+ ":language: yaml\n",
+ "```\n",
+ "````\n",
+ "\n",
+ "`````\n",
+ "``````"
+ ]
+ },
  {
  "cell_type": "markdown",
  "id": "81a16281",
@@ -74,6 +119,7 @@
  "id": "4b1a926e",
  "metadata": {},
  "source": [
+ "(air-repertoire-ir-profiling-key-takeaway-1)=\n",
  "## VDJ-sequencing\n",
  "\n",
  "### Cell isolation\n",
@@ -2633,19 +2679,13 @@
  "adata_bcr.obs.head()"
  ]
  },
- {
- "cell_type": "markdown",
- "id": "5e729d42",
- "metadata": {},
- "source": [
- "## Quality Control"
- ]
- },
  {
  "cell_type": "markdown",
  "id": "3ab5b0ed",
  "metadata": {},
  "source": [
+ "(air-repertoire-ir-profiling-key-takeaway-2)=\n",
+ "## Quality Control\n",
  "For analysis, we rely on high quality input data. It is therefore of great importance to identify cells with incorrect or incomplete AIR information:\n",
  "- **Incomplete AIRs**: A cell is assigned only a VJ or only a VDJ chain, because the other chain is missed during sequencing. While these are still valid cells, we cannot utilize them for downstream analysis when full AIR sequence information is required.\n",
  "- **Multiple AIRs**: It is also possible that a cell gets assigned multiple AIRs. While it has been observed that T- / B-cells can express dual AIR {cite}`schuldt2019dual`, cells with more than two IRs are indicative of doublets, which should not be used for downstream analysis.\n",
@@ -2971,6 +3011,7 @@
  "id": "6831ca70",
  "metadata": {},
  "source": [
+ "(air-repertoire-ir-profiling-key-takeaway-3)=\n",
  "### Filtering"
  ]
  },
@@ -3206,18 +3247,6 @@
  "sc.write(adata=adata_bcr, filename=path_bcr_out)"
  ]
  },
- {
- "cell_type": "markdown",
- "id": "d604b650",
- "metadata": {},
- "source": [
- "## Key Takeaways\n",
- "\n",
- "- VDJ-sequencing measures the sequence information of AIRs, which provides us insights into the function of B- and T-cells.\n",
- "- To detect doublets, cells with multiple AIRs should be identified or filtered.\n",
- "- Depending on the analysis, we additionally filter cells with incomplete AIR information."
- ]
- },
  {
  "cell_type": "markdown",
  "id": "e7b77012",

jupyter-book/air_repertoire/multimodal_integration.ipynb

Lines changed: 57 additions & 12 deletions
@@ -8,6 +8,54 @@
  "# Integrating AIR and transcriptomics"
  ]
  },
+ {
+ "cell_type": "markdown",
+ "id": "827c6211",
+ "metadata": {},
+ "source": [
+ "```{dropdown} <i class=\"fas fa-brain\"></i>&nbsp;&nbsp;&nbsp;Key takeaways\n",
+ "\n",
+ ":::{card}\n",
+ ":link: air-repertoire-multimodal-integration-key-takeaway-1\n",
+ ":link-type: ref\n",
+ "For multimodal datasets of AIR and GEX, typically, one modality is used for grouping the cells to perform standard uni-modal analysis on the other modality (e.g., sequence analysis on Leiden clusters).\n",
+ ":::\n",
+ "\n",
+ ":::{card}\n",
+ ":link: air-repertoire-multimodal-integration-key-takeaway-2\n",
+ ":link-type: ref\n",
+ "Cell functionality (determined by AIR) and cell state (observed via GEX) are interlinked.\n",
+ "It has been shown that cells with similar AIR sequences can share similar phenotypes.\n",
+ ":::\n",
+ "\n",
+ ":::{card}\n",
+ ":link: air-repertoire-multimodal-integration-key-takeaway-3\n",
+ ":link-type: ref\n",
+ "Due to the inherent structural difference between count matrices (GEX) and amino acid sequences (IR), it is difficult to directly fuse both modalities.\n",
+ "Therefore, several methods were developed recently to utilize paired GEX-AIR data to derive clusters or an embedding.\n",
+ "However, these approaches are still novel and not part of a standard analysis pipeline.\n",
+ ":::\n",
+ "\n",
+ "```\n",
+ "\n",
+ "``````{dropdown} <i class=\"fa-solid fa-gear\"></i>&nbsp;&nbsp;&nbsp;Environment setup\n",
+ "`````{tab-set}\n",
+ " \n",
+ "````{tab-item} Steps\n",
+ "```{include} ../_static/default_text_env_setup.md\n",
+ "```\n",
+ "````\n",
+ "\n",
+ "````{tab-item} yml\n",
+ "```{literalinclude} compositional.yml\n",
+ ":language: yaml\n",
+ "```\n",
+ "````\n",
+ "\n",
+ "`````\n",
+ "``````"
+ ]
+ },
  {
  "cell_type": "markdown",
  "id": "c6b657ac",
@@ -232,11 +280,18 @@
  "In this visualization, we see a clear separation between Plasmablasts and B cells."
  ]
  },
+ {
+ "cell_type": "markdown",
+ "id": "ff5fbc79",
+ "metadata": {},
+ "source": []
+ },
  {
  "cell_type": "markdown",
  "id": "b52dc6b3",
  "metadata": {},
  "source": [
+ "(air-repertoire-multimodal-integration-key-takeaway-1)=\n",
  "## Uni-modal Analysis with multimodal conditions\n",
  "\n",
  "While studies often provide paired measurements for both modalities, these are often analyzed individually utilizing only limited shared information. Often one modality (AIRR or transcriptome) is used to provide conditions, on which the other modality is then analyzed. E.g., we can observe how cells from the same clonal lineage adapt to perturbation by DEG analysis of these cells at different points in time. \n",
@@ -1087,6 +1142,7 @@
  "id": "939c796e",
  "metadata": {},
  "source": [
+ "(air-repertoire-multimodal-integration-key-takeaway-2)=\n",
  "##### Output\n",
  "\n",
  "We now receive three output files that we can further use for downstream tasks:\n",
@@ -1386,6 +1442,7 @@
  "id": "3888057f",
  "metadata": {},
  "source": [
+ "(air-repertoire-multimodal-integration-key-takeaway-3)=\n",
  "#### mvTCR\n",
  "\n",
  "mvTCR by An et al. is a multiview variational autoencoder that compresses TCR sequence and gene expression into a lower-dimensional representation {cite}`an2021jointly`. Two deep learning architectures - Transformer and Multi-layer perceptron - extract information from both TCR and GEX, respectively, before they are fused to derive the joint space. Afterwards, the trained models can be used to embed similar data. \n",
@@ -2367,18 +2424,6 @@
  "Here, we projected the ten largest clusters upon the transcriptome space. While some clusters follow the same phenotype, other clusters separate in the RNA space. The cluster annotation could be used for downstream analysis (see above) on the BCR or transcriptomic side or to infer ancestral relationships between cells (see paper)."
  ]
  },
- {
- "cell_type": "markdown",
- "id": "205b4ad6",
- "metadata": {},
- "source": [
- "## Key Takeaways\n",
- "\n",
- "- For multimodal datasets of AIR and GEX, typically, one modality is used for grouping the cells to perform standard uni-modal analysis on the other modality (e.g., sequence analysis on Leiden clusters). \n",
- "- Cell functionality (determined by AIR) and cell state (observed via GEX) are interlinked. It has been shown that cells with similar AIR sequences can share similar phenotypes.\n",
- "- Due to the inherent structural difference between count matrices (GEX) and amino acid sequences (IR), it is difficult to directly fuse both modalities. Therefore, several methods were developed recently to utilize paired GEX-AIR data to derive clusters or an embedding. However, these approaches are still novel and not part of a standard analysis pipeline."
- ]
- },
  {
  "cell_type": "markdown",
  "id": "54051eba",
