Replies: 2 comments 3 replies
I managed to narrow it down to one line that causes glam to be slower:

    let uv = sphere.uv_map(intersection_point);

If I replace it with

    let uv = Point2::ZERO; // `Vec2::ZERO` for glam

glam takes only 15.194 ns, while my math implementation takes 24.109 ns.

This is the function:

    fn uv_map(point: glam::Vec3A) -> glam::Vec2 {
        let u = 0.5 + (-point.z).atan2(point.x) * 0.5 * FRAC_1_PI;
        let v = 0.5 - point.y.asin() * FRAC_1_PI;
        glam::Vec2::new(u, v)
    }
This is usually because some "hot" function is not getting inlined. LTO "thin" is not able to inline functions (or it's limited), whereas LTO "fat" is. glam works around this by adding the

Looking at your code, it's hard to say for sure without looking at the disassembly or profiling, but I'd suggest trying the following:
Also your
If you want to look at the asm, I recommend trying https://crates.io/crates/cargo-asm. If things aren't getting inlined, it's usually fairly obvious in the asm.
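To make the inlining point concrete: one general way to make a hot function eligible for inlining across crate boundaries, without relying on fat LTO, is Rust's `#[inline]` attribute. Below is a minimal sketch applied to the `uv_map` function from this thread. The `Vec3` and `Vec2` structs are hypothetical stand-ins (not glam's types) so the example compiles without dependencies, and whether this actually closes the gap in the benchmark is an assumption that would need checking in the asm.

```rust
use std::f32::consts::FRAC_1_PI;

// Hypothetical stand-ins for glam::Vec3A and glam::Vec2,
// used only so this sketch builds without external crates.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec2 {
    pub x: f32,
    pub y: f32,
}

/// Maps a point on the unit sphere to UV coordinates.
///
/// `#[inline]` makes the function body available to callers in other
/// crates, so the compiler may inline it even without fat LTO.
/// It is a hint, not a guarantee.
#[inline]
pub fn uv_map(point: Vec3) -> Vec2 {
    let u = 0.5 + (-point.z).atan2(point.x) * 0.5 * FRAC_1_PI;
    let v = 0.5 - point.y.asin() * FRAC_1_PI;
    Vec2 { x: u, y: v }
}

fn main() {
    // A point on the equator along +x maps to the centre of UV space.
    let uv = uv_map(Vec3 { x: 1.0, y: 0.0, z: 0.0 });
    println!("{:?}", uv);
}
```

If the function still shows up as a separate call in the asm after this, the cost is more likely in `atan2` and `asin` themselves than in call overhead.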
I'm working on a path tracer and wanted to switch to glam, because it uses SIMD.
When I replaced my own scalar math types (`f64`) with those from glam (`Mat4`, `Vec3A`), I found that, strangely, glam is slower.

I've made a small benchmark for the ray-sphere intersection routine I use, which can be found here. After a bit of experimenting I found that using `lto = "fat"` causes my math implementation to be significantly faster than glam.

Does anybody know why this is the case or how to fix it?