@ymzayek ymzayek commented Nov 22, 2023

This is an initial implementation that uses CuPy in place of some NumPy operations to speed up computation.

It is already somewhat faster, especially as the overlap increases (e.g. 1.4x faster with the example in example_experimental_data.py when overlap is 5). The example is runnable as is, and you can set engine to "cpu" or "gpu". The patch coordinates calculated in get_patch_locs are no longer used for extracting the patches, but they are still used for the recombination, as I'm not yet sure how to do that step more efficiently.
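For readers unfamiliar with the pattern, here is a minimal sketch of how a "cpu"/"gpu" engine switch between NumPy and CuPy can look. It is not the code in this PR: `get_array_module` and `patch_means` are hypothetical names chosen for illustration, and the per-patch mean is only a stand-in for the real computation.

```python
import numpy as np


def get_array_module(engine="cpu"):
    """Return the array library for the requested engine (hypothetical helper)."""
    if engine == "cpu":
        return np
    if engine == "gpu":
        try:
            # CuPy is an optional dependency and needs a CUDA-capable GPU.
            import cupy as cp
        except ImportError as err:
            raise RuntimeError("engine='gpu' requires CuPy to be installed") from err
        return cp
    raise ValueError(f"unknown engine: {engine!r}")


def patch_means(patches, engine="cpu"):
    """Mean of each patch, computed with NumPy or CuPy (illustrative only)."""
    xp = get_array_module(engine)
    # asarray moves the data to the GPU when xp is CuPy; no copy for NumPy input.
    patches = xp.asarray(patches)
    # Reduce over every axis except the first (the patch index).
    return patches.mean(axis=tuple(range(1, patches.ndim)))
```

Because CuPy mirrors most of the NumPy API, the same function body can serve both engines; with engine="gpu" the result stays on the device until it is copied back explicitly (for example with `cupy.asnumpy`).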

Note that some of the CuPy code mirrors what was done in the PyTorch implementation, so before merging this code (if that happens) I'd like to give author credit to @achamma723.

TODO:

  • Figure out other parts that can be made more efficient
  • Formally compare GPU vs CPU
  • ADD TESTS!

ymzayek commented Nov 22, 2023

More TODO:

  • Run a profiler
  • Run on high resolution data
  • Batching to deal with expected memory issues with high res data + larger patch overlaps
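As a rough idea of what the batching item could look like, here is a minimal sketch that processes patches in fixed-size chunks so that only one chunk needs to be held in working memory at a time. The names `iter_batches` and `process_in_batches` and the default batch size are assumptions for illustration, not part of this PR; the sketch uses NumPy only, and on the GPU each batch would additionally be transferred to the device and back.

```python
import numpy as np


def iter_batches(n_items, batch_size):
    """Yield slices covering range(n_items) in chunks of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_items, batch_size):
        yield slice(start, min(start + batch_size, n_items))


def process_in_batches(patches, func, batch_size=1024):
    """Apply func to patches one batch at a time and stack the results.

    patches has shape (n_patches, ...); func maps a batch of patches to an
    array whose first axis is still the patch index.
    """
    if len(patches) == 0:
        return func(patches)
    results = [func(patches[sl]) for sl in iter_batches(len(patches), batch_size)]
    return np.concatenate(results, axis=0)
```

The batch size would need tuning against the available GPU memory, since both higher resolution and larger overlap increase the number of patches.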
