
Error in Continuous Var. Concatenation #104

@cstocker45

Hey, I came across this issue while trying to run the Bayes approach on a rather large dataset. I'm working on integrating some microarray RNA data with morphological features we have calculated.

[INFO - identify_associations]: Perturbing dataset: 'smallCAT_Drug_MOVE'
[INFO - identify_associations]: Beginning task: identify associations categorical
[INFO - identify_associations]: Training models
[INFO - identify_associations]: Identifying significant features
Error executing job with overrides: ['data=Dataset1', 'task=random_small__id_assoc_bayes']
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/move/main.py", line 42, in main
    move.tasks.identify_associations(config)
  File "/usr/local/lib/python3.10/site-packages/move/tasks/identify_associations.py", line 826, in identify_associations
    sig_ids, *extra_cols = _bayes_approach(
  File "/usr/local/lib/python3.10/site-packages/move/tasks/identify_associations.py", line 273, in _bayes_approach
    bayes_k[i, :] = np.log(prob + 1e-8) - np.log(1 - prob + 1e-8)
ValueError: could not broadcast input array from shape (30,) into shape (1030,)

Looking at the error in more detail, it seems there was an issue combining the two continuous datasets, one with 30 columns and the other with 1000 columns, hence the target shape (1030,).
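To illustrate the shape mismatch, here is a minimal NumPy sketch. It is not MOVE's actual code: the names `prob_small`, `prob_large`, and `log_odds` are hypothetical, and the only things taken from the report are the column counts (30 and 1000) and the log-odds expression from the traceback. It assumes the probabilities reaching line 273 covered only the 30-column dataset while the output row expected all 1030 continuous features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Column counts from the report: two continuous datasets.
n_small, n_large = 30, 1000
num_features = n_small + n_large  # 1030

# One output row per perturbed feature; a single row is enough here.
bayes_k = np.empty((1, num_features))


def log_odds(prob, eps=1e-8):
    """Same expression as in the traceback; eps avoids log(0)."""
    return np.log(prob + eps) - np.log(1 - prob + eps)


# Hypothetical: probabilities that only cover the 30-column dataset.
prob_small = rng.uniform(size=n_small)

try:
    bayes_k[0, :] = log_odds(prob_small)  # (30,) into a (1030,) row
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Concatenating the per-dataset probabilities first gives a vector whose
# length matches the row, so the assignment succeeds.
prob_large = rng.uniform(size=n_large)
prob_all = np.concatenate([prob_small, prob_large])
bayes_k[0, :] = log_odds(prob_all)
```

Whether concatenation is the right remedy depends on how the per-dataset outputs are produced upstream, so treat this only as a way to reproduce the shape mismatch in isolation.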

I haven't seen this issue reported by other users, so perhaps it is unique to my setup. I'm running a CPU build of PyTorch, which could be related.

Please see the pull request for a working fix. It may not work on all systems or datasets.
