
Conversation


@ieivanov ieivanov commented Oct 11, 2024

This PR largely copies https://github.com/royerlab/imgproc-monorepo/blob/main/impp/impp/pyramid.py, except that pyramids are computed on the CPU rather than the GPU, since stitches of OPS datasets do not easily fit in GPU memory. We should discuss how we can consolidate to a single implementation of the pyramid functionality - maybe move it to iohub?

TODO:

  • Remove slurmkit dependency
  • Add pyramid to the biahub CLI group

cc @ziw-liu @JoOkuma


Soorya19Pradeep commented Apr 23, 2025

Hello @ieivanov, I am trying to switch to this branch and pip install biahub in the conda environment I created, but I am getting the following error:

The conflict is caused by:
biahub 0.1.0 depends on iohub==0.2.0a1
waveorder 2.2.2.dev1378+g8315e00 depends on iohub==0.1.0

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip to attempt to solve the dependency conflict

Could you please help? Thanks!

Edit: After talking to @edyoshikun, it sounds like these conflicts are resolved on main and can be fixed here by merging main into this branch.

@ieivanov
Collaborator Author

@Soorya19Pradeep I think you should be able to install biahub now. You'll also need to install slurmkit separately to use this PR, since it is no longer a biahub dependency.

@mattersoflight mattersoflight added this to the Data Infrastructure milestone Aug 14, 2025
@mattersoflight mattersoflight changed the title from "Add pyramids on cpu based on Joarao's code" to "Add pyramids on cpu based on Jordao's code" Aug 14, 2025
@srivarra srivarra self-assigned this Aug 22, 2025
        target_store = dataset[str(level)].tensorstore()
        target_store[:].write(downsampled[:].read().result()).result()


Just as a heads up, I was having some pretty bad memory issues when generating the pyramid like this: memory would spike massively near the end of writing. Putting some sort of loop on the writes seemed to cap it. Here's what my code looks like:

        step = dst_ts.chunk_layout.write_chunk.shape[0]
        for start in range(0, downsampled_ts.shape[0], step):
            stop = min(start + step, downsampled_ts.shape[0])
            dst_ts[start:stop].write(downsampled_ts[start:stop]).result()

Also, wouldn't downsampled[:].read().result() read the entire source level into memory?
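A generic version of that loop can be sketched independently of tensorstore. Assuming only that the write chunk extent along axis 0 is known, the slice bookkeeping looks like this (chunked_ranges is a hypothetical helper, not part of the PR):

```python
def chunked_ranges(length: int, chunk: int):
    """Yield (start, stop) pairs tiling [0, length) in chunk-sized steps."""
    for start in range(0, length, chunk):
        yield start, min(start + chunk, length)

# Each slab is read and written independently, so peak memory stays near
# one chunk's worth of data rather than the whole array.
print(list(chunked_ranges(10, 4)))  # → [(0, 4), (4, 8), (8, 10)]
```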

Contributor


I agree, tensorstore is pretty bad with memory efficiency. You probably need some looping that is chunk-aligned here, as @ivirshup suggested. Also putting them in a transaction will help tensorstore schedule/consolidate the I/O.


Collaborator


@ivirshup @ziw-liu Interesting, I'll take a look at chunk-aligned looping and transactions.


@ziw-liu I get the impression tensorstore can be great with memory, just not from the highest-level API, and there are a lot of footguns 😆

> Also putting them in a transaction will help tensorstore schedule/consolidate the I/O.

I had tried this, but did not see an improvement. Does this work well in your hands? Would love a pointer to some code where this worked so I can give it another shot.
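For reference, the chunked loop and a transaction could be combined in one helper. This is a duck-typed sketch, not the PR's code: write_in_chunks is a hypothetical name, and the with_transaction call is an assumption modeled on tensorstore's TensorStore.with_transaction API; with transaction=None the same helper also works on plain NumPy arrays:

```python
import numpy as np

def write_in_chunks(dst, src, step, transaction=None):
    """Copy src into dst in slabs of `step` rows along axis 0.

    If a transaction is given, route writes through dst.with_transaction
    (tensorstore-style); otherwise write to dst directly.
    """
    target = dst.with_transaction(transaction) if transaction is not None else dst
    for start in range(0, src.shape[0], step):
        stop = min(start + step, src.shape[0])
        target[start:stop] = src[start:stop]

# Demo on plain NumPy arrays (transaction=None skips with_transaction):
dst = np.zeros(10, dtype=int)
write_in_chunks(dst, np.arange(10), step=4)
print(dst)  # → [0 1 2 3 4 5 6 7 8 9]
```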

Comment on lines +20 to +26
    def pyramid(fov_path: Path, levels: int, method: str) -> None:
        """
        Create pyramid levels for a single field of view using tensorstore downsampling.

        This function uses cascade downsampling, where each level is downsampled from
        the previous level rather than from level 0. This avoids aliasing artifacts
        and chunk boundary issues that occur with large downsample factors.
Collaborator Author


Could this function move into iohub, say as a method of Position?
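To make the cascade idea in that docstring concrete, here is a NumPy-only sketch. The actual function uses tensorstore downsampling; 2x mean-binning per level and the name cascade_pyramid are assumptions for illustration only:

```python
import numpy as np

def cascade_pyramid(level0: np.ndarray, levels: int) -> list:
    """Build pyramid levels, each 2x mean-binned from the previous level."""
    pyramid = [level0]
    for _ in range(1, levels):
        prev = pyramid[-1]
        # Trim odd edges so the array tiles evenly into 2x2 blocks.
        h, w = (prev.shape[0] // 2) * 2, (prev.shape[1] // 2) * 2
        binned = prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(binned)
    return pyramid

shapes = [lvl.shape for lvl in cascade_pyramid(np.ones((8, 8)), levels=3)]
print(shapes)  # → [(8, 8), (4, 4), (2, 2)]
```

Because each level is binned from the previous one, every step is only a 2x downsample, which is what sidesteps the aliasing and chunk-boundary issues of a single large-factor downsample from level 0.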
