Hey,
I know that you optimized this project for the A100, and I read that people got it running on the 4090 and the 3090. I am only able to work with 2080s (university hardware).
When I try to run your code (amg_example.py), I get the following errors:
torch._inductor.utils: [WARNING] not enough SMs to use max_autotune_gemm mode
followed by a long dump of "code", and then:
BackendCompilerFailed: backend='inductor' raised:
RuntimeError: Internal Triton PTX codegen error:
ptxas /tmp/compile-ptx-src-76618e, line 149; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-76618e, line 149; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
(.....)
ptxas /tmp/compile-ptx-src-76618e, line 200; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-76618e, line 200; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas fatal : Ptx assembly aborted due to error
Is this just a limitation of my hardware, or am I doing something wrong?
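For context, the ptxas messages say the generated kernels use `.bf16` instructions, which need `sm_80` or higher, while the 2080 is a Turing card (compute capability 7.5). Below is a minimal sketch of how that condition could be checked. The helper names `supports_bf16_ptx`, `pick_dtype_name` and `describe_current_gpu` are hypothetical, not part of this project. Falling back to float16 is only an assumption about what might avoid the bf16 code path, not something the project is known to support.

```python
# Sketch only: checks whether a GPU's compute capability meets the sm_80
# requirement named in the ptxas error. Helper names are hypothetical.

def supports_bf16_ptx(major: int, minor: int) -> bool:
    """True if the compute capability is sm_80 or higher."""
    return (major, minor) >= (8, 0)


def pick_dtype_name(major: int, minor: int) -> str:
    """Suggest a half-precision dtype name for the given capability.

    Assumption: float16 is an acceptable substitute on pre-sm_80 GPUs.
    """
    return "bfloat16" if supports_bf16_ptx(major, minor) else "float16"


def describe_current_gpu() -> str:
    """Report the capability of the first visible CUDA device, if any."""
    try:
        import torch  # imported lazily so the helpers above work without it
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "no CUDA device visible"
    major, minor = torch.cuda.get_device_capability()
    return f"sm_{major}{minor}: suggested dtype {pick_dtype_name(major, minor)}"


if __name__ == "__main__":
    print(describe_current_gpu())
```

For a 2080, `pick_dtype_name(7, 5)` returns `"float16"`, which matches the error above: the card is below the `sm_80` threshold that ptxas reports.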
PS: The original model runs fine, and your project also runs if I use "sam_model_registry" (I guess that is just the Meta implementation).
Thank you.