AT2 CI Build Transition -- CUDA
#13917
Closed
sebrowne announced in Announcements
The AT2/cuda11 Action is planned to be set to ‘required’ on Monday 03/31/2025.
Monitoring of this CI build has shown a high pass rate since 03/01/2025. One known issue remains:
AnalyticPolynomialsMatch_Sacado_Fad_DFadType_Sacado_Fad_DFadType_HierarchicalNodalComparisons_UnitTest fails intermittently (#13915). This test remains enabled because it covers all of the unit tests for Intrepid2 and does not fail every time. However, its failure rate is suspect and may cause problems, so it may need to be reevaluated depending on how disruptive it is to development.
The corresponding build (rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables), which runs through the AutoTester1 system, will be set to non-blocking accordingly and will be decommissioned soon afterward.
No other CI builds will be modified at this time. Announcements will be made for any subsequent CI build transitions.
Please open an issue at https://github.com/trilinos/Trilinos/issues and tag the trilinos/framework team if you encounter any issues that you think are related to this change.
An email will also be sent to the trilinos-developers mailing list.
One notable change is that the new configuration uses A100 (Ampere) GPUs instead of V100 (Volta) GPUs. The new hardware also has only 2 GPUs per machine rather than 4, which may affect testing throughput.
Known Issues
If you are an external collaborator and see an error like the one here, you will need somebody from the internal development team to re-run the failed workflow. The re-run should then get past that issue.