Dataset sc lung #27
This PR removed or changed files that were very important for the project, including the |
Hi @Kraftfahrzeughaftpflichtversicherung! Thanks for your contributions!
I proposed some changes to make this align better with existing components :)
FILE_PATHS = {"file": TMP_DIR / "cropped_sc.h5ad"}
os.system(f'wget http://192.168.2.46:8000/file/cropped_sc.h5ad -P ./tmp/')
adata = ad.read_h5ad( './tmp/cropped_sc.h5ad')
This code is pointing to local endpoints, which unfortunately won't work.
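To illustrate why the hardcoded URL fails outside the author's network: `192.168.2.46` lies in a private address range, so CI runners and other contributors cannot reach it. The following is a minimal sketch of a check for this using only the standard library. `is_private_endpoint` is a hypothetical helper introduced here for illustration, not part of the project.

```python
import ipaddress
from urllib.parse import urlparse


def is_private_endpoint(url: str) -> bool:
    """Return True if the URL's host is localhost or a private/loopback IP literal.

    Hypothetical helper for illustration only; it does not resolve DNS names.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Only IP literals can be classified without a DNS lookup.
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # A regular hostname such as ftp.ebi.ac.uk: treated as public here.
        return False


local_url = "http://192.168.2.46:8000/file/cropped_sc.h5ad"
public_url = "https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-13526"
print(is_private_endpoint(local_url), is_private_endpoint(public_url))
```

A check like this could run before any download step, so that a local-only source is rejected early instead of failing later inside `wget`.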
name: process_nsclc_sc_zuani
namespace: datasets/workflows

argument_groups:
Can we add a section for the input file?
Suggested change:
- argument_groups:
+ argument_groups:
+   - name: Inputs
+     arguments:
+       - type: file
+         name: --input
+         description: Path to the dataset
+         required: true
+         example: "https://ftp.ebi.ac.uk/biostudies/fire/E-MTAB-/526/E-MTAB-13526/Files/10X_Lung_Tumour_Annotated_v2.h5ad"
uns_info = { "dataset_id": "E-MTAB-13526" ,
    "dataset_name":"E-MTAB-13526" ,
    "dataset_url":"https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-13526" ,
    "dataset_reference": "https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-13526",
    "dataset_summary": 'none',
    "dataset_description":'none',
    "dataset_organism": 'Homo sapiens'
}

for key in ["dataset_id", "dataset_name", "dataset_url", "dataset_reference", "dataset_summary", "dataset_description", "dataset_organism"]:
    adata.uns[key] = uns_info[key]
Values like 'dataset_id' and 'dataset_name' are arguments, but they are also hardcoded here. These values should be read from `par` and passed in by the dataset script.
Suggested change:
- uns_info = { "dataset_id": "E-MTAB-13526" ,
-     "dataset_name":"E-MTAB-13526" ,
-     "dataset_url":"https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-13526" ,
-     "dataset_reference": "https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-13526",
-     "dataset_summary": 'none',
-     "dataset_description":'none',
-     "dataset_organism": 'Homo sapiens'
- }
- for key in ["dataset_id", "dataset_name", "dataset_url", "dataset_reference", "dataset_summary", "dataset_description", "dataset_organism"]:
-     adata.uns[key] = uns_info[key]
+ for key in ["dataset_id", "dataset_name", "dataset_url", "dataset_reference", "dataset_summary", "dataset_description", "dataset_organism"]:
+     adata.uns[key] = par[key]
Also, make sure to add a script similar to this one: https://github.com/openproblems-bio/task_ist_preprocessing/blob/ea67087326ae00912e0006d1f643d990576ed414/scripts/create_resources/process_10x_xenium.sh
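The point about reading the metadata from `par` instead of hardcoding it can be sketched roughly as below. This is an illustration under assumptions, not the project's implementation: `par` is assumed to be the argument dictionary available to the script, a plain dict stands in for `adata.uns`, and `UNS_KEYS` and `collect_uns_metadata` are hypothetical names. The example values are the ones from the PR.

```python
# Keys the review asks to take from `par` rather than hardcode.
UNS_KEYS = [
    "dataset_id",
    "dataset_name",
    "dataset_url",
    "dataset_reference",
    "dataset_summary",
    "dataset_description",
    "dataset_organism",
]


def collect_uns_metadata(par: dict) -> dict:
    """Pick the dataset metadata out of `par`, failing loudly if any is missing."""
    missing = [key for key in UNS_KEYS if par.get(key) is None]
    if missing:
        raise ValueError(f"missing dataset metadata arguments: {missing}")
    return {key: par[key] for key in UNS_KEYS}


# Stand-in for the arguments the dataset script would pass in (assumption).
par = {
    "input": "10X_Lung_Tumour_Annotated_v2.h5ad",
    "dataset_id": "E-MTAB-13526",
    "dataset_name": "E-MTAB-13526",
    "dataset_url": "https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-13526",
    "dataset_reference": "https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-13526",
    "dataset_summary": "none",
    "dataset_description": "none",
    "dataset_organism": "Homo sapiens",
}

uns = {}  # stand-in for adata.uns, which behaves like a dict
uns.update(collect_uns_metadata(par))
```

Validating up front means a forgotten argument surfaces as one clear error rather than a `KeyError` halfway through the loop.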
I tried my best to integrate this dataset, and it even passed the tests, failing only the latest one.
Unfortunately, my computational resources said "bye" to me for now, and I need to fix that, so it will probably take some time.
For now, I will add everything I have.

Also, I didn't change anything in the common folder, so why is it highlighted here?