Replies: 6 comments
-
Hi @JHYSiu, could you provide a reproducible example using a file or dataset that we can access? I couldn't tell what you were doing when the error occurred. Note that the MuData converters were written by the MuData team, not by us, so we may not be able to help, but I'd still like to see a dataset and code to see what I can do.
-
Thanks! Here's a massively subsetted example.
Python export code:
R import code:
R session info:
Matrix products: default
locale:
attached base packages:
other attached packages:
loaded via a namespace (and not attached):
-
Thanks for uploading the example. I have a feeling that the MuData package is going to need some enhancements to work with your use case. Here's how I look at the situation. The MuData package includes an example that produces an h5mu file, miniacc.h5mu, from a TCGA MultiAssayExperiment. We can use the h5ls utility (available from The HDF Group, Homebrew, etc., if you don't already have it) to look at the layout:
That's the top level Group list. Then drill down:
We can see that as we descend the Group hierarchy we find Dataset instances named X. For your example, which was generated using Python and not MuData::writeH5MU, we see
I am not sure that MuData::readH5MU is suited to this structure. You should contact the MuData authors for clarification. In their DESCRIPTION I see
implying that the round trips must start with a MultiAssayExperiment (MAE). Here's where to post your question: https://github.com/ilia-kats/MuData/issues
Finally, I do not think it will be difficult to dig data out of your .h5mu file using rhdf5 and/or reticulate with h5py imported. You might post a query to support.bioconductor.org, where someone may have already tackled this.
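As a rough illustration of the h5py route, the sketch below reads `X` for each modality. It is not taken from this thread: it assumes the modalities sit under a top-level `mod` group (the layout described by the MuData on-disk format), and that an `X` which is a Group rather than a Dataset holds component datasets such as the `data`, `indices` and `indptr` arrays of a sparse matrix. `read_x` and `read_all_x` are hypothetical helper names.

```python
def read_x(modality):
    """Return the contents of `X` below one modality group.

    A dense `X` (an HDF5 Dataset) is returned as a single array.
    If `X` is a Group, its child datasets are returned in a dict keyed
    by name (for example `data`, `indices`, `indptr`).
    """
    x = modality["X"]
    # h5py.Dataset exposes .shape; h5py.Group does not.
    if hasattr(x, "shape"):
        return x[()]  # read the whole dataset into memory
    return {name: child[()] for name, child in x.items() if hasattr(child, "shape")}


def read_all_x(filename):
    """Read `X` for every modality in an .h5mu file (assumes a top-level `mod` group)."""
    import h5py  # imported here so read_x can be tried out without h5py installed

    with h5py.File(filename, "r") as f:
        return {name: read_x(group) for name, group in f["mod"].items()}
```

Calling `read_all_x("FILENAME.h5mu")` would return one entry per modality. Everything is read into memory, so this is only sensible for small or subsetted files.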
-
Thanks for looking into this @vjcitn! Helpful for me too.
-
For what it's worth, the anndataR package (under development on GitHub) provides some useful building blocks for a 'native' R parser, and the anndata / mudata structures are very similar. Here's some code that starts down the path (the RNA experiment, obs and assays only...; this uses the 'devel' version of Bioconductor and hence of rhdf5, which includes an important extension for managing HDF5 'enum' types).
It's interesting / unfortunate that constructing an MAE loads most of the data into memory (even if the 'large' matrix data were to be left on-disk via TENxMatrix)...
-
Excellent, thank you everyone. I will have a play around and see if I can get it to work! I'll let you know if I come up with anything robust. :)
-
Hi
Sorry, I'm unfamiliar with converting between MuData and R format.
I am unable to load my h5mu data in. I keep getting "Error in h5checktype(). The provided H5Identifier is not a dataset identifier." I'm not sure how to fix this. Thanks!
From the rhdf5::H5Fopen output, the file still appears to match the format of the dummy example data.
I exported my h5mu with just mdata.write("FILENAME.h5mu")
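One way to compare the layout of two files from Python, in the spirit of `h5ls -r`, is to walk the group hierarchy with h5py and note whether each node is a Group or a Dataset. The following is a minimal sketch, not code from this thread; `walk_layout` and `print_layout` are hypothetical helper names, and the Group/Dataset test relies on the assumption that only datasets expose a `.shape` attribute.

```python
def walk_layout(node, prefix=""):
    """Yield (path, kind, shape) for every node below an HDF5-like group.

    Works on h5py.Group / h5py.File objects, and on any nested mapping
    whose leaves expose a `.shape` attribute.
    """
    for name, child in node.items():
        path = f"{prefix}/{name}"
        if hasattr(child, "shape"):  # datasets have a shape; groups do not
            yield path, "Dataset", child.shape
        else:
            yield path, "Group", None
            yield from walk_layout(child, path)  # descend into the sub-group


def print_layout(filename):
    """Print one line per Group or Dataset in the file."""
    import h5py  # imported here so walk_layout can be tried out without h5py installed

    with h5py.File(filename, "r") as f:
        for path, kind, shape in walk_layout(f):
            print(path, kind, shape if shape is not None else "")
```

Running `print_layout` on both the example file and the exported file makes it easier to spot a node that is a Group in one file and a Dataset in the other.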
