I'm quite happy that I found Ginkgo, thanks for it! I had not used a GPU before and have now bought a large workstation with an RTX 5080, mainly for Ginkgo. The examples run, and mixed-precision-ir works with the cuda executor. However, my own Ginkgo code, which worked fine with the reference and omp executors, fails with a segfault inside Ginkgo when I use the cuda executor. The mixed-precision-ir example has code like
I suspect that I also need the host_x handling (which was not used in the examples I based my implementation on). Is this the case? Where can I find background information? I find it difficult to locate the right place in the code documentation. The tutorials do not help either; this page only summarizes topics without giving the details I am looking for: https://github.com/ginkgo-project/ginkgo/wiki/Tutorial-8:-Optimize:-Using-GPUs
With a debug build, I confirmed that the issue is indeed in the matrix and vector handling. Unlike the simple-solver example, I hold my data locally as CSR arrays and vectors, e.g. like
and
The code in my first post was from the mixed-precision-ir.cpp example.
I now have
where only x is created on exec->get_master(); csr and b are created directly on exec, which is cuda in this case. It seems that x is transferred from host to cuda implicitly. With your help and further investigation I now understand the concept and the meaning of exec->get_master() much better, thanks a lot!