Description
Hey, I came across this issue while running the Bayes approach on a fairly large dataset. I'm working on integrating microarray RNA data with morphological features we have calculated.
[INFO - identify_associations]: Perturbing dataset: 'smallCAT_Drug_MOVE'
[INFO - identify_associations]: Beginning task: identify associations categorical
[INFO - identify_associations]: Training models
[INFO - identify_associations]: Identifying significant features
Error executing job with overrides: ['data=Dataset1', 'task=random_small__id_assoc_bayes']
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/move/main.py", line 42, in main
    move.tasks.identify_associations(config)
  File "/usr/local/lib/python3.10/site-packages/move/tasks/identify_associations.py", line 826, in identify_associations
    sig_ids, *extra_cols = _bayes_approach(
  File "/usr/local/lib/python3.10/site-packages/move/tasks/identify_associations.py", line 273, in _bayes_approach
    bayes_k[i, :] = np.log(prob + 1e-8) - np.log(1 - prob + 1e-8)
ValueError: could not broadcast input array from shape (30,) into shape (1030,)
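Looking at the error in more detail, the problem seems to arise when the two continuous datasets are combined: the first has 30 columns and the second has 1000, hence the target shape (1030,). The array being assigned has shape (30,), which matches the column count of the first dataset alone.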
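For anyone who wants to see the same failure outside of MOVE, here is a minimal NumPy sketch. It is not the library's code: the function name, the array sizes, and the constant `prob` values are stand-ins I made up, and it only assumes that `bayes_k` has one column per continuous feature across both datasets while `prob` covers just the first one.

```python
import numpy as np


def reproduce_broadcast_error(n_rows=3, n_small=30, n_large=1000):
    """Return the ValueError message raised by a mismatched row assignment."""
    # Hypothetical stand-in for the output array: one column per continuous
    # feature across both datasets (30 + 1000 = 1030).
    bayes_k = np.empty((n_rows, n_small + n_large))
    # Hypothetical stand-in for `prob` when it only covers the first dataset.
    prob = np.full(n_small, 0.5)
    try:
        # Same expression as line 273 of the traceback, with made-up inputs.
        bayes_k[0, :] = np.log(prob + 1e-8) - np.log(1 - prob + 1e-8)
    except ValueError as err:
        return str(err)
    return None


message = reproduce_broadcast_error()
print(message)
```

The assignment fails because a length-30 vector cannot be broadcast into a length-1030 row, so the mismatch is in the shapes themselves and not in the values of `prob`.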
I haven't seen this issue reported by other users, so it may be specific to my setup. I'm running a CPU-only build of PyTorch, which could be related.
Please see the pull request for a working fix. It may not work on all systems or datasets.