Description
Right now both flow potential and cumulative current are solved in the same loop.
e.g.:
flow_potential_storage_array = fill(0.0, (size(resistance)..., nthreads()))
cumulative_current_storage_array = fill(0.0, (size(resistance)..., nthreads()))
for i in moving_windows
    # solve flow potential for window i
    # add flow potential for window i to the flow potential storage array
    # solve current flow for window i
    # add current for window i to the cumulative current storage array
end
# sum flow potential along 3rd dim
# sum cumulative current along 3rd dim
So storage arrays (X by Y by N_THREADS, later summed along the N_THREADS dimension) have to be allocated for both quantities at the same time. If flow potential were solved first, it could be summed and stored as an X by Y array, and its storage array released before the array for cumulative current is allocated. This shouldn't take any longer to run, but it will be much more memory-efficient, since only one X by Y by N_THREADS array is live at a time.
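To put rough numbers on the saving, here is a back-of-the-envelope estimate. The grid size and thread count are made up for illustration, and the count covers only the storage arrays and the two summed results, not whatever the solvers allocate internally.

```julia
# Hypothetical grid size and thread count, chosen only for illustration.
nx, ny, nt = 5_000, 5_000, 8

# Bytes in one X by Y Float64 array.
bytes_per_layer = nx * ny * sizeof(Float64)

# Same-loop approach: both X by Y by N_THREADS arrays stay alive until the
# end, alongside the two summed X by Y results.
peak_combined = (2 * nt + 2) * bytes_per_layer

# Split approach: at worst one X by Y by N_THREADS array is alive, plus the
# two summed X by Y results.
peak_split = (nt + 2) * bytes_per_layer

println("same loop:  ", peak_combined / 2^30, " GiB")
println("split loop: ", peak_split / 2^30, " GiB")
```

The gap grows with the thread count, because only the per-thread storage term is halved.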
The new code would look like:
flow_potential_storage_array = fill(0.0, (size(resistance)..., nthreads()))
for i in moving_windows
    # solve flow potential for window i
    # add flow potential for window i to the flow potential storage array
end
# sum flow potential along 3rd dim (new object of size size(resistance))
flow_potential_storage_array = nothing  # drop the reference so the GC can reclaim it
cumulative_current_storage_array = fill(0.0, (size(resistance)..., nthreads()))
for i in moving_windows
    # solve current flow for window i
    # add current for window i to the cumulative current storage array
end
# sum cumulative current along 3rd dim
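A minimal runnable sketch of the two-pass idea, under some assumptions: `solve_flow_potential` and `solve_current` are hypothetical stand-ins for the real per-window solvers, `resistance` is a 2-D array, and each task writes to its own layer selected by chunk index rather than by `threadid()`. That last choice is a swap from the pseudocode, made so the indexing does not depend on how the scheduler assigns thread ids.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical stand-ins for the real solvers. Each returns a contribution
# with the same size as `resistance`.
solve_flow_potential(resistance, window) = resistance .* window
solve_current(resistance, window) = resistance .+ window

function accumulate_windows(solve, resistance, moving_windows; ntasks = nthreads())
    # One X by Y layer per task, so no two tasks write to the same memory.
    storage = zeros(size(resistance)..., ntasks)
    chunk_len = max(1, cld(length(moving_windows), ntasks))
    chunks = Iterators.partition(moving_windows, chunk_len)
    @sync for (c, chunk) in enumerate(chunks)
        @spawn begin
            layer = view(storage, :, :, c)
            for w in chunk
                layer .+= solve(resistance, w)
            end
        end
    end
    # Collapse the task dimension. `storage` is local, so it becomes
    # garbage as soon as the function returns.
    return dropdims(sum(storage; dims = 3); dims = 3)
end

resistance = rand(4, 5)
moving_windows = 1:10

# First pass: only the flow potential storage array exists.
flow_potential = accumulate_windows(solve_flow_potential, resistance, moving_windows)
# Second pass: the first storage array is already unreachable.
cumulative_current = accumulate_windows(solve_current, resistance, moving_windows)
```

Keeping the storage array local to a function has the same effect as assigning `nothing` to it: once the function returns, nothing references the array and the garbage collector is free to reclaim it.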
Need to give some more thought to memory management in general, though. E.g. maybe there is a better way to make this threadsafe than allocating a separate array for each thread. I just didn't want to bother with locks/unlocks, as I know that can come with a compute time penalty.
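For comparison, here is a sketch of the lock-based alternative, assuming a hypothetical `solve_window` stand-in for the real solver. Each window is solved outside the lock and only the final addition is serialized, so a single X by Y array is enough. Whether the lock costs noticeable time depends on how long a solve takes relative to the addition, which would need benchmarking.

```julia
using Base.Threads: @threads

# Hypothetical stand-in for the real per-window solver.
solve_window(resistance, window) = resistance .* window

function accumulate_with_lock(solve, resistance, moving_windows)
    total = zeros(size(resistance))   # single shared X by Y accumulator
    lk = ReentrantLock()
    @threads for i in eachindex(moving_windows)
        # The expensive solve runs in parallel, outside the lock.
        contribution = solve(resistance, moving_windows[i])
        # Only the cheap in-place addition is serialized.
        lock(lk) do
            total .+= contribution
        end
    end
    return total
end

total = accumulate_with_lock(solve_window, ones(3, 3), 1:10)
```

If a window's result only covers part of the grid, the locked update would touch just that region, which shortens the time spent holding the lock.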