Any particular reason for this? (It speeds up a component of rand's [`UniformInt`](https://github.com/rust-lang-nursery/rand/pull/561) by up to 2x. We could implement it with the platform intrinsic, but that's not ideal.)