concat_on_disk fails to write alternative axis mapping and uns #1854

@milos7250

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of anndata.
  • (optional) I have confirmed this bug exists on the master branch of anndata.

Report

Concatenating anndata files that contain a mapping on the alternative axis (e.g. concatenating along obs when the inputs have varm) makes concat_on_disk fail with the error below. In addition, the uns_merge argument is ignored entirely.

Code:

import anndata as ad

ad.experimental.concat_on_disk(
    in_files={"1": "sample1.h5ad", "2": "sample2.h5ad"},
    out_file="output.h5ad",
    max_loaded_elems=int(1e9),
    axis=0,
    join="outer",
    merge="unique",
    uns_merge="unique",
    index_unique="-",
)

Traceback:

Traceback (most recent call last):
  File "/mnt/shared/scratch/mmicik/rna-scripts/merge_adatas.py", line 68, in <module>
    ad.experimental.concat_on_disk(
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/experimental/merge.py", line 630, in concat_on_disk
    _write_alt_mapping(groups, output_group, alt_axis_name, alt_index, merge)
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/experimental/merge.py", line 381, in _write_alt_mapping
    write_elem(output_group, alt_axis_name, alt_mapping)
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 487, in write_elem
    Writer(_REGISTRY).write_elem(store, k, elem, dataset_kwargs=dataset_kwargs)
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/utils.py", line 249, in func_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 354, in write_elem
    return write_func(store, k, elem, dataset_kwargs=dataset_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 71, in wrapper
    result = func(g, k, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/methods.py", line 353, in write_mapping
    _writer.write_elem(g, sub_k, sub_v, dataset_kwargs=dataset_kwargs)
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/utils.py", line 249, in func_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 351, in write_elem
    write_func = self.find_write_func(dest_type, elem, modifiers)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 318, in find_write_func
    return self.registry.get_write(dest_type, type(elem), modifiers, writer=self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 134, in get_write
    raise IORegistryError._from_write_parts(dest_type, src_type, modifiers)
anndata._io.specs.registry.IORegistryError: No method registered for writing <class 'pandas.core.series.Series'> into <class 'h5py._hl.group.Group'>
Error raised while writing key 'gene_id' of <class 'h5py._hl.group.Group'> to /var

Looking at the source code, I noticed that the _write_alt_mapping function tries to write the mapping data (which belongs in the varm group) into var. After changing this part, concatenation completes, but uns is still missing from the result. It turns out that the code meant to merge uns is missing as well.
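
For reference, here is a rough plain-Python sketch of the result I would expect from uns_merge="unique": a key is kept only when every input that defines it agrees on its value, and nested mappings are merged recursively. The merge_unique helper and the example uns dicts are hypothetical names made up for this illustration. This is not anndata's implementation, and it only handles plain Python values (no arrays).

```python
from collections.abc import Mapping
from typing import Any


def merge_unique(dicts: list[Mapping[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: keep keys that have exactly one distinct value.

    Simplified illustration of the "unique" strategy, not anndata's code.
    """
    # Collect keys in first-seen order across all inputs.
    keys: list[str] = []
    for d in dicts:
        for k in d:
            if k not in keys:
                keys.append(k)

    merged: dict[str, Any] = {}
    for key in keys:
        values = [d[key] for d in dicts if key in d]
        if all(isinstance(v, Mapping) for v in values):
            # Nested mappings are merged with the same rule.
            merged[key] = merge_unique(values)
        elif all(v == values[0] for v in values[1:]):
            # Every input that has this key agrees on the value.
            merged[key] = values[0]
        # Otherwise the values conflict and the key is dropped.
    return merged


# Made-up example inputs standing in for the uns of two files.
uns_1 = {"species": "human", "batch": "A", "params": {"n_pcs": 50}}
uns_2 = {
    "species": "human",
    "batch": "B",
    "params": {"n_pcs": 50, "seed": 0},
    "note": "rerun",
}

result = merge_unique([uns_1, uns_2])
print(result)
```

In this sketch "batch" is dropped because the two inputs disagree, while "species", "note" and the nested "params" survive. The output of concat_on_disk on 0.11.3 has no uns content at all, regardless of the uns_merge value.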

I've had a go at fixing these issues, see my commit. It works without problems on my data. I was hesitant to open a PR with this fix because I have not written appropriate tests for the change, but here it is: #1855.

Versions

| Package      | Version |
| ------------ | ------- |
| anndata      | 0.11.3  |
| numpy        | 2.1.3   |
| pandas       | 2.2.3   |
| flatten_json | 0.1.14  |
| tqdm         | 4.67.1  |

| Dependency         | Version     |
| ------------------ | ----------- |
| natsort            | 8.4.0       |
| python-dateutil    | 2.9.0.post0 |
| charset-normalizer | 3.4.1       |
| Deprecated         | 1.2.18      |
| wrapt              | 1.17.2      |
| Cython             | 3.0.12      |
| asciitree          | 0.3.3       |
| pytz               | 2024.1      |
| session-info2      | 0.1.2       |
| zarr               | 2.18.4      |
| h5py               | 3.12.1      |
| setuptools         | 75.8.0      |
| six                | 1.17.0      |
| msgpack            | 1.1.0       |
| numcodecs          | 0.15.1      |
| packaging          | 24.2        |
| scipy              | 1.15.1      |

| Component | Info                                                                          |
| --------- | ----------------------------------------------------------------------------- |
| Python    | 3.12.8 \| packaged by conda-forge \| (main, Dec  5 2024, 14:24:40) [GCC 13.3.0] |
| OS        | Linux-6.1.0-31-amd64-x86_64-with-glibc2.36                                    |
| Updated   | 2025-02-12 17:43                                                              |
