Optimized arithmetic methods for strided triangular matrices (#52571)
This uses broadcasting for operations like
`A::UpperTriangular + B::UpperTriangular` when the parents are
`StridedMatrix`es. Looping only over the triangular part is usually
faster for large matrices, where memory is presumably the bottleneck.
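
To illustrate why visiting only the triangular part helps, here is a minimal sketch of adding two upper-triangular matrices while touching only the entries on or above the diagonal. The name `triu_add` is hypothetical and this is not the code added by the PR (which goes through broadcasting); it only shows the access pattern under the assumption that both parents support plain indexing.

```julia
using LinearAlgebra

# Hypothetical helper, for illustration only: add two upper-triangular
# matrices by looping over the stored triangle instead of the full square.
function triu_add(A::UpperTriangular, B::UpperTriangular)
    size(A) == size(B) || throw(DimensionMismatch("sizes of A and B must match"))
    Ap, Bp = parent(A), parent(B)
    n = size(Ap, 1)
    T = promote_type(eltype(Ap), eltype(Bp))
    # The strict lower triangle of C is left uninitialized; the
    # UpperTriangular wrapper never reads it.
    C = similar(Ap, T, (n, n))
    for j in 1:n, i in 1:j   # column-major order: rows 1..j of column j
        @inbounds C[i, j] = Ap[i, j] + Bp[i, j]
    end
    return UpperTriangular(C)
end

A = UpperTriangular([1 2; 3 4])
B = UpperTriangular([10 20; 30 40])
S = triu_add(A, B)
```

The loop reads and writes roughly half of the entries of each matrix, which is where the saving comes from when memory traffic dominates.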
Some performance comparisons, using
```julia
julia> U = UpperTriangular(rand(1000,1000));
julia> U1 = UnitUpperTriangular(rand(size(U)...));
```
| Operation | master | PR |
| --------------- | ---------- | ----- |
| `-U` | `1.011 ms (3 allocations: 7.63 MiB)` | `559.680 μs (3 allocations: 7.63 MiB)` |
| `U + U`/`U - U` | `971.740 μs (3 allocations: 7.63 MiB)` | `560.063 μs (3 allocations: 7.63 MiB)` |
| `U + U1`/`U - U1` | `3.014 ms (9 allocations: 22.89 MiB)` | `944.772 μs (3 allocations: 7.63 MiB)` |
| `U1 + U1` | `4.509 ms (12 allocations: 30.52 MiB)` | `1.687 ms (3 allocations: 7.63 MiB)` |
| `U1 - U1` | `3.357 ms (9 allocations: 22.89 MiB)` | `1.763 ms (3 allocations: 7.63 MiB)` |
I've retained the existing methods as fallback, in case there's current
code that works without broadcasting.
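
The fallback arrangement can be pictured with Julia's dispatch on the parent type. The sketch below is an assumption about the general pattern, not the PR's actual method signatures; `which_path` is a hypothetical function used only to show which method would be selected.

```julia
using LinearAlgebra

# Hypothetical illustration: a generic method serves as the fallback,
# and a more specific method is chosen when the parent is a StridedMatrix.
which_path(A::UpperTriangular) = "generic fallback"
which_path(A::UpperTriangular{<:Any,<:StridedMatrix}) = "strided fast path"

dense = UpperTriangular(rand(3, 3))              # parent is a Matrix (strided)
lazy  = UpperTriangular(reshape(1:9, 3, 3))      # parent is a reshaped range (not strided)

path_dense = which_path(dense)
path_lazy  = which_path(lazy)
```

Because the more specific method only applies to strided parents, any other parent type keeps using the pre-existing code path.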