-
Notifications
You must be signed in to change notification settings - Fork 243
Expand eigen() and add eig[vals,vecs]() #2787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Expand eigen() and add eig[vals,vecs]() #2787
Conversation
- Expand LinearAlgebra.eigen() to handle non-symmetric matrices via recent `Xgeev` - Add LinearAlgebra.eigvals() and LinearAlgebra.eigvecs()
Your PR requires formatting changes to meet the project's style guidelines. Click here to view the suggested changes.diff --git a/test/libraries/cusolver/dense.jl b/test/libraries/cusolver/dense.jl
index ecc9078ed..71ad9ce83 100644
--- a/test/libraries/cusolver/dense.jl
+++ b/test/libraries/cusolver/dense.jl
@@ -333,31 +333,31 @@ sorteig!(λ::AbstractVector, sortby::Union{Function, Nothing} = eigsortby) = sor
@testset "geev!" begin
local d_W, d_V
- A = rand(elty,m,m)
- d_A = CuArray(A)
- Eig = eigen(A)
- d_eig = eigen(d_A)
+ A = rand(elty, m, m)
+ d_A = CuArray(A)
+ Eig = eigen(A)
+ d_eig = eigen(d_A)
sorteig!(d_eig.values, d_eig.vectors)
@test Eig.values ≈ collect(d_eig.values)
- h_V = collect(d_eig.vectors)
- h_V⁻¹ = inv(h_V)
- @test abs.(h_V⁻¹*Eig.vectors) ≈ I
+ h_V = collect(d_eig.vectors)
+ h_V⁻¹ = inv(h_V)
+ @test abs.(h_V⁻¹ * Eig.vectors) ≈ I
- A = rand(elty,m,m)
- d_A = CuArray(A)
- W = eigvals(A)
- d_W = eigvals(d_A)
+ A = rand(elty, m, m)
+ d_A = CuArray(A)
+ W = eigvals(A)
+ d_W = eigvals(d_A)
sorteig!(d_W)
- @test W ≈ collect(d_W)
+ @test W ≈ collect(d_W)
- A = rand(elty,m,m)
- d_A = CuArray(A)
- V = eigvecs(A)
- d_W = eigvals(d_A)
- d_V = eigvecs(d_A)
+ A = rand(elty, m, m)
+ d_A = CuArray(A)
+ V = eigvecs(A)
+ d_W = eigvals(d_A)
+ d_V = eigvecs(d_A)
sorteig!(d_W, d_V)
- V⁻¹ = inv(V)
- @test abs.(V⁻¹*collect(d_V)) ≈ I
+ V⁻¹ = inv(V)
+ @test abs.(V⁻¹ * collect(d_V)) ≈ I
end
end
@@ -402,7 +402,7 @@ sorteig!(λ::AbstractVector, sortby::Union{Function, Nothing} = eigsortby) = sor
d_A = CuArray(A)
Eig = eigen(LinearAlgebra.Hermitian(A))
if CUSOLVER.version() >= v"11.7.1"
- d_eig = eigen(d_A)
+ d_eig = eigen(d_A)
sorteig!(d_eig.values, d_eig.vectors)
@test Eig.values ≈ collect(d_eig.values)
end
@@ -418,42 +418,42 @@ sorteig!(λ::AbstractVector, sortby::Union{Function, Nothing} = eigsortby) = sor
@test abs.(Eig.vectors'*h_V) ≈ I
end
- A = rand(elty,m,m)
- A += A'
- d_A = CuArray(A)
- W = eigvals(LinearAlgebra.Hermitian(A))
+ A = rand(elty, m, m)
+ A += A'
+ d_A = CuArray(A)
+ W = eigvals(LinearAlgebra.Hermitian(A))
if CUSOLVER.version() >= v"11.7.1"
- d_W = eigvals(d_A)
+ d_W = eigvals(d_A)
sorteig!(d_W)
- @test W ≈ collect(d_W)
+ @test W ≈ collect(d_W)
end
- d_W = eigvals(LinearAlgebra.Hermitian(d_A))
- @test W ≈ collect(d_W)
+ d_W = eigvals(LinearAlgebra.Hermitian(d_A))
+ @test W ≈ collect(d_W)
if elty <: Real
- W = eigvals(LinearAlgebra.Symmetric(A))
- d_W = eigvals(LinearAlgebra.Symmetric(d_A))
- @test W ≈ collect(d_W)
+ W = eigvals(LinearAlgebra.Symmetric(A))
+ d_W = eigvals(LinearAlgebra.Symmetric(d_A))
+ @test W ≈ collect(d_W)
end
- A = rand(elty,m,m)
- A += A'
- d_A = CuArray(A)
- V = eigvecs(LinearAlgebra.Hermitian(A))
+ A = rand(elty, m, m)
+ A += A'
+ d_A = CuArray(A)
+ V = eigvecs(LinearAlgebra.Hermitian(A))
if CUSOLVER.version() >= v"11.7.1"
- d_W = eigvals(d_A)
- d_V = eigvecs(d_A)
+ d_W = eigvals(d_A)
+ d_V = eigvecs(d_A)
sorteig!(d_W, d_V)
- h_V = collect(d_V)
- @test abs.(V'*h_V) ≈ I
+ h_V = collect(d_V)
+ @test abs.(V' * h_V) ≈ I
end
- d_V = eigvecs(LinearAlgebra.Hermitian(d_A))
- h_V = collect(d_V)
- @test abs.(V'*h_V) ≈ I
+ d_V = eigvecs(LinearAlgebra.Hermitian(d_A))
+ h_V = collect(d_V)
+ @test abs.(V' * h_V) ≈ I
if elty <: Real
- V = eigvecs(LinearAlgebra.Symmetric(A))
- d_V = eigvecs(LinearAlgebra.Symmetric(d_A))
- h_V = collect(d_V)
- @test abs.(V'*h_V) ≈ I
+ V = eigvecs(LinearAlgebra.Symmetric(A))
+ d_V = eigvecs(LinearAlgebra.Symmetric(d_A))
+ h_V = collect(d_V)
+ @test abs.(V' * h_V) ≈ I
end
end
|
Thanks. Can you add some tests? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CUDA.jl Benchmarks
Benchmark suite | Current: 3d0f36d | Previous: 1004fd9 | Ratio |
---|---|---|---|
latency/precompile |
43031373202 ns |
43347615330 ns |
0.99 |
latency/ttfp |
7173263952 ns |
7144432561 ns |
1.00 |
latency/import |
3423781003 ns |
3422583362 ns |
1.00 |
integration/volumerhs |
9605952 ns |
9621539.5 ns |
1.00 |
integration/byval/slices=1 |
147112 ns |
147204 ns |
1.00 |
integration/byval/slices=3 |
425992 ns |
426126.5 ns |
1.00 |
integration/byval/reference |
145268 ns |
145306 ns |
1.00 |
integration/byval/slices=2 |
286683.5 ns |
286683 ns |
1.00 |
integration/cudadevrt |
103590 ns |
103645 ns |
1.00 |
kernel/indexing |
14248 ns |
14297 ns |
1.00 |
kernel/indexing_checked |
14914 ns |
14899 ns |
1.00 |
kernel/occupancy |
707.7708333333334 ns |
742.9652777777778 ns |
0.95 |
kernel/launch |
2215.1111111111113 ns |
2214.8888888888887 ns |
1.00 |
kernel/rand |
15562 ns |
18344.5 ns |
0.85 |
array/reverse/1d |
19675 ns |
19882 ns |
0.99 |
array/reverse/2d |
23676 ns |
23850 ns |
0.99 |
array/reverse/1d_inplace |
10240 ns |
10508 ns |
0.97 |
array/reverse/2d_inplace |
11832 ns |
12080 ns |
0.98 |
array/copy |
21336 ns |
21206 ns |
1.01 |
array/iteration/findall/int |
158355 ns |
159710.5 ns |
0.99 |
array/iteration/findall/bool |
139132.5 ns |
139788 ns |
1.00 |
array/iteration/findfirst/int |
162389.5 ns |
163426.5 ns |
0.99 |
array/iteration/findfirst/bool |
163669.5 ns |
164898.5 ns |
0.99 |
array/iteration/scalar |
71620 ns |
73632 ns |
0.97 |
array/iteration/logical |
215127 ns |
219076 ns |
0.98 |
array/iteration/findmin/1d |
46710 ns |
47341 ns |
0.99 |
array/iteration/findmin/2d |
97023 ns |
97310 ns |
1.00 |
array/reductions/reduce/1d |
36311.5 ns |
36812 ns |
0.99 |
array/reductions/reduce/2d |
42188 ns |
41111 ns |
1.03 |
array/reductions/mapreduce/1d |
34148.5 ns |
35000.5 ns |
0.98 |
array/reductions/mapreduce/2d |
47730.5 ns |
41081 ns |
1.16 |
array/broadcast |
21165 ns |
21375 ns |
0.99 |
array/copyto!/gpu_to_gpu |
11256 ns |
12932 ns |
0.87 |
array/copyto!/cpu_to_gpu |
214394 ns |
217571 ns |
0.99 |
array/copyto!/gpu_to_cpu |
285576 ns |
286553 ns |
1.00 |
array/accumulate/1d |
109412 ns |
109198 ns |
1.00 |
array/accumulate/2d |
80595.5 ns |
81190 ns |
0.99 |
array/construct |
1250.4 ns |
1248.9 ns |
1.00 |
array/random/randn/Float32 |
43357 ns |
44976.5 ns |
0.96 |
array/random/randn!/Float32 |
25041 ns |
25430 ns |
0.98 |
array/random/rand!/Int64 |
27195 ns |
27147 ns |
1.00 |
array/random/rand!/Float32 |
8701 ns |
8919.333333333334 ns |
0.98 |
array/random/rand/Int64 |
30059 ns |
30142 ns |
1.00 |
array/random/rand/Float32 |
13115 ns |
13307 ns |
0.99 |
array/permutedims/4d |
61299 ns |
61252 ns |
1.00 |
array/permutedims/2d |
55151 ns |
55576 ns |
0.99 |
array/permutedims/3d |
56054 ns |
56240 ns |
1.00 |
array/sorting/1d |
2761352 ns |
2778491 ns |
0.99 |
array/sorting/by |
3368586.5 ns |
3369601 ns |
1.00 |
array/sorting/2d |
1086296.5 ns |
1087382 ns |
1.00 |
cuda/synchronization/stream/auto |
1040.5 ns |
1018.7 ns |
1.02 |
cuda/synchronization/stream/nonblocking |
7264 ns |
8223 ns |
0.88 |
cuda/synchronization/stream/blocking |
803.7444444444444 ns |
800.9072164948453 ns |
1.00 |
cuda/synchronization/context/auto |
1193.6 ns |
1159.8 ns |
1.03 |
cuda/synchronization/context/nonblocking |
7195.4 ns |
7478.9 ns |
0.96 |
cuda/synchronization/context/blocking |
929.7777777777778 ns |
897.4426229508197 ns |
1.04 |
This comment was automatically generated by workflow using github-action-benchmark.
I've added some, though most fail due to the fact that Should we return ordered eigenpairs, too? Or should we just order them in the tests alone? Reordering could impact performance. |
I've simplified a bit the Unfortunately, it seems that the output of As for FInally, I still cannot get the two |
It turns out I accidentally overlooked the final basis transformation necessary for real The only points left are whether to do eigenvalues/vectors sorting or not, and what to do with the commented out |
Lacking familiarity, some superficial thoughts.
Given that sorting is relatively expensive, and IIUC does not practically matter if only for convenience/reproducibility, I'd only do this in the tests.
What functionality is missing? We typically do not "overload" LAPACK methods like |
Great, I'll do that.
The Lines 1053 to 1063 in 8b6a2a0
which in turn use these definitions: Lines 637 to 672 in 8b6a2a0
is missing. The definitions above use a "legacy" API, with type-dependent function names (e.g. The generic API functions are translated in CUDA.jl/lib/cusolver/dense_generic.jl Lines 387 to 418 in 8b6a2a0
So, while the symmetric diagonalization routines have both the "legacy" and "generic" API, the non-symmetric diagonalization routine has just the "generic" API This results in a somewhat unbalanced situation in the existing codebase. Calling e.g. |
It's unfortunate that we (still) have these (again). The Base LAPACK interface isn't really meant to be extended; IME it's better to implement interfaces at a higher level (e.g. LinearAlgebra) and dispatch to methods specific to CUDA libraries. There are also often incompatibilities between the CPU and GPU LAPACK interfaces that prevent this kind of code reuse. |
Sounds great, I'll then remove the commented out tests. As a side note, I've tried to adhere to the whitespace style of current tests, but runic is not happy; do you really want me to reformat as runic says? |
No need. |
Xgeev