Thank you for the nice library! I am trying to understand the advantages of WarpConvNet's PTV3 implementation over the default PTV3 implementation.

1. Is there a benchmarking script (memory and time) for PTV3 or any of its components?
2. When is `wp.init()` needed? For example, [test_attention.py](https://github.com/NVlabs/WarpConvNet/blob/2cde94fadd5cab9b9685aa35b52dd340f05e61f3/tests/nn/test_attention.py#L31-L41) calls it, whereas [point_transformer_v3.py](https://github.com/NVlabs/WarpConvNet/blob/2cde94fadd5cab9b9685aa35b52dd340f05e61f3/examples/point_transformer_v3.py#L276-L285) does not.
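
To make the benchmarking question concrete, here is a minimal sketch of the kind of measurement I have in mind. It is not taken from WarpConvNet; `benchmark` is a hypothetical helper, and it assumes the workload can be wrapped in a zero-argument callable. It uses only standard PyTorch CUDA calls (`torch.cuda.synchronize`, `torch.cuda.reset_peak_memory_stats`, `torch.cuda.max_memory_allocated`) and falls back to wall-clock timing alone when CUDA or PyTorch is unavailable.

```python
import time

try:
    import torch
except ImportError:  # without PyTorch the helper still times plain callables
    torch = None


def benchmark(fn, warmup=3, iters=10):
    """Hypothetical helper: return (mean seconds per call, peak CUDA bytes or None)."""
    use_cuda = torch is not None and torch.cuda.is_available()

    # Warm-up runs so one-time costs (kernel compilation, caching) are excluded.
    for _ in range(warmup):
        fn()

    if use_cuda:
        torch.cuda.synchronize()              # finish pending GPU work first
        torch.cuda.reset_peak_memory_stats()  # peak now starts at current usage

    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if use_cuda:
        torch.cuda.synchronize()              # wait for the timed kernels to finish
    mean_seconds = (time.perf_counter() - start) / iters

    peak_bytes = torch.cuda.max_memory_allocated() if use_cuda else None
    return mean_seconds, peak_bytes


if __name__ == "__main__":
    # Stand-in workload; a real comparison would pass something like
    # `lambda: model(inputs)` for each PTV3 implementation (names hypothetical).
    seconds, peak = benchmark(lambda: sum(i * i for i in range(10_000)))
    print(f"mean time: {seconds * 1e3:.3f} ms, peak CUDA memory: {peak}")
```

Running the same helper on both implementations with identical inputs would give comparable time and peak-memory numbers. Note that the reported peak includes memory already allocated before the timed loop, such as model weights.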