README.md: 43 additions & 2 deletions
@@ -95,6 +95,8 @@ PIPEX runs sequentially a series of commands written in the file `pipex_batch_li

 The following commands are currently available:

+-`run_id <run identifier>` : an optional command that labels your current PIPEX run with an identifier. Useful if you want to integrate PIPEX into a larger pipeline, like PyShowwwcase. **OBS**: PIPEX will assign a random run_id if you don't add this command.
+
 -`swap <number of GB>` : Linux only. Generates a temporary swap space of the specified size in the installation folder; the space is deleted automatically at the end of the full PIPEX process. **OBS**: it requires root permission/password while executing.

 -`segmentation.py` : performs PIPEX segmentation. Uses the following parameters:
@@ -254,14 +256,24 @@ If you add the `generate_geojson` command to PIPEX command list a `cell_segmenta


 TissUUmaps integration
-------------------
+----------------------

 If you add the `generate_tissuumaps` command to the PIPEX command list, an `anndata_TissUUmaps.h5ad` file will be generated in your analysis/downstream sub-folder. You can open this file in TissUUmaps. To do so:
 - Load the `anndata_TissUUmaps.h5ad` file in TissUUmaps

 If you add the `include_html=yes` parameter to the `generate_tissuumaps` command, a `TissUUmaps_webexport` folder will be generated in your analysis/downstream sub-folder. You can share this folder on a web server and access it from any web browser.

+**NOTE**: TissUUmaps requires your images to be in `TIFF` format and to be named exactly after your markers (for example: `DAPI.tif`, `CPEP.tif`, etc.)
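The naming requirement in the note can be checked before exporting. The following is a minimal sketch, not part of PIPEX or TissUUmaps: the function name `missing_marker_images` is hypothetical, and accepting both `.tif` and `.tiff` extensions is an assumption (the note itself only shows `.tif`).

```python
from pathlib import Path


def missing_marker_images(image_dir, markers, extensions=(".tif", ".tiff")):
    """Return the markers that have no image file named exactly after them.

    Hypothetical helper: accepting both .tif and .tiff is an assumption.
    """
    # Compare against the file names as stored in the folder, so the match
    # is exact and case-sensitive even on case-insensitive filesystems.
    present = {p.name for p in Path(image_dir).iterdir() if p.is_file()}
    return [
        marker
        for marker in markers
        if not any(marker + ext in present for ext in extensions)
    ]
```

An empty list means every marker has a matching image; any returned marker names point at files that would need renaming.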
+
+
+Pipeline integration
+--------------------
+
+PIPEX can be integrated as a step in a larger pipeline or queue.
+- By default, a random `run_id` is assigned to every batch of PIPEX operations, and a file named `LAST_RUN_ID` containing the same identifier is generated in the root folder once the process has finished.
+- To force PIPEX to use a specific identifier for the next run, add a file named `run_id.txt` containing that identifier to the root folder. The `LAST_RUN_ID` file will be updated accordingly when the process has finished.
+- You can also specify a run identifier directly with the PIPEX command `run_id`.
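From the side of a surrounding pipeline, the file-based handshake described above could be sketched as follows. This is an illustration under assumptions, not PIPEX code: the helper names `request_run_id` and `read_last_run_id` are hypothetical, and both files are assumed to hold nothing but the identifier as plain text.

```python
from pathlib import Path


def request_run_id(root, run_id):
    """Write run_id.txt so that the next PIPEX run uses this identifier.

    Assumption: the file contains only the identifier, as plain text.
    """
    path = Path(root) / "run_id.txt"
    path.write_text(str(run_id))
    return path


def read_last_run_id(root):
    """Return the identifier stored in LAST_RUN_ID, or None if it is absent."""
    path = Path(root) / "LAST_RUN_ID"
    if not path.exists():
        return None
    # Strip surrounding whitespace in case the file ends with a newline.
    return path.read_text().strip()
```

A pipeline step would call `request_run_id` before launching PIPEX and `read_last_run_id` once the process has finished, comparing the two values to confirm that the expected run completed.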


 Annex 1: Detailed segmentation explanation
@@ -403,4 +415,33 @@ PIPEX's analysis step includes an optional marker filtration commonly used in Ce
 -`CDH1` 1% top ranked intensities cell removal
 -`CTNNB1` 1% top ranked intensities cell removal

-Please make sure you the name of your marker column is a stric match with the aforementioned ones
+Please make sure the name of your marker column strictly matches one of the aforementioned ones
+
+
+Annex 4: Cluster refinement procedure
+-------------------------------------
+
+PIPEX's analysis step can optionally refine the unsupervised clustering results (leiden and/or kmeans). This can help you with the manual annotation and merging of the automatically discovered clusters.
+
+The idea behind the cluster refinement algorithm is to explore the ranked genes associated with each cluster and try to match them against rules stated by the user. The algorithm then assigns a confidence score per cluster and rule, depending on how close the cluster's ranked genes are to the rule definition(s). Finally, for each cluster, the refinement picks the annotated cluster with the highest confidence (ties are resolved by row order).
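The final selection step (highest confidence, ties resolved by row order) can be illustrated with a short sketch. How PIPEX computes the confidence scores is not described here, so the scores are taken as given; the function name `pick_annotation` and the input layout are hypothetical.

```python
def pick_annotation(confidences):
    """Return (row_index, confidence) of the best-scoring rule row.

    `confidences` holds one score per rule row, in file order. The scores
    themselves are assumed to be computed elsewhere.
    """
    if not confidences:
        return None
    # max() returns the first maximal item, so earlier rows win ties.
    return max(enumerate(confidences), key=lambda item: item[1])


print(pick_annotation([40.0, 75.0, 75.0, 10.0]))  # (1, 75.0)
```

Rows 1 and 2 tie at 75.0 in this example, and the earlier row is chosen, which mirrors the row-order tie-break described above.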
+
+To use the cluster refinement, you have to create a `cell_types.csv` file with rows containing the following information:
+-`cell_group`: used as a prefix for the manually annotated cluster name. The final cluster name will be `[cell_group]-[cell_type]-[cell_subtype]`
+-`cell_type`: used as an infix for the manually annotated cluster name. The final cluster name will be `[cell_group]-[cell_type]-[cell_subtype]`
+-`cell_subtype`: used as a suffix for the manually annotated cluster name. The final cluster name will be `[cell_group]-[cell_type]-[cell_subtype]`
+-`rank_filter`: directs the refinement procedure to use only certain ranked genes. The default is `all` (no filtering); you can also use `positive_only` (use only ranked genes with positive values)
+-`min_confidence`: by default, the refinement procedure aggressively merges all clusters that minimally fulfill the indicated rules. You can make the process stricter by using a higher `min_confidence` value (from 0 to 100)
+-`marker[n]` and `rule[n]` pairs: you can add an arbitrary number (at least one!) of marker rules to guide how the algorithm annotates/merges the automatically discovered clusters. The marker value must match one of your analysis markers, and the rule states where the marker should be placed relative to the other ranked genes (values `high`, `medium`, `low`)
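The constraints listed above can be checked before running the analysis. The sketch below is not part of PIPEX: the helper names are hypothetical, `marker[n]`/`rule[n]` are assumed to be spelled as numbered columns such as `marker1`/`rule1`, and an empty `min_confidence` is assumed to mean 0. Each row is expected as a dict, for instance as produced by `csv.DictReader`.

```python
import re

VALID_RANK_FILTERS = {"all", "positive_only"}
VALID_RULES = {"high", "medium", "low"}


def check_cell_type_row(row):
    """Return a list of problems found in one cell_types.csv row (a dict)."""
    problems = []
    rank_filter = row.get("rank_filter") or "all"
    if rank_filter not in VALID_RANK_FILTERS:
        problems.append(f"unknown rank_filter: {rank_filter}")
    try:
        # Assumption: an empty min_confidence is treated as 0.
        confidence = float(row.get("min_confidence") or 0)
    except ValueError:
        confidence = -1.0
    if not 0 <= confidence <= 100:
        problems.append("min_confidence must be between 0 and 100")
    # Collect the numbered marker columns that are actually filled in
    # (assumed column naming: marker1, marker2, ...).
    indices = [
        m.group(1)
        for key in row
        if (m := re.fullmatch(r"marker(\d+)", key)) and row[key]
    ]
    if not indices:
        problems.append("at least one marker/rule pair is required")
    for n in indices:
        if row.get(f"rule{n}") not in VALID_RULES:
            problems.append(f"rule{n} must be one of high, medium, low")
    return problems


def cluster_name(row):
    """Build the final name `[cell_group]-[cell_type]-[cell_subtype]`."""
    return "-".join(row[k] for k in ("cell_group", "cell_type", "cell_subtype"))
```

Checking that each marker value matches one of your analysis markers is left out, since that list depends on your own data.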
+
+Here's an example of how a `cell_types.csv` file usually looks: