Description
Hi,
thank you for the nice work and for sharing your code!
I believe that your implementation of BEVFormer has a small bug:
In `simple_bev/nets/bevformernet.py`, line 296 at commit `be46f0e`:

```python
self.deformable_attention = MSDeformAttn(d_model=dim, n_levels=1, n_heads=4, n_points=8)
```
It looks like the values for the parameters `n_heads` and `n_points` have been swapped compared to the default signature of `MSDeformAttn`:

```python
def __init__(self, d_model=256, n_levels=4, n_heads=8, n_points=4):
```
See also the original implementation of BEVFormer:

```python
def __init__(self, embed_dims=256, num_heads=8, num_levels=4, num_points=4,
```

https://github.com/fundamentalvision/BEVFormer/blob/20923e66aa26a906ba8d21477c238567fa6285e9/projects/mmdet3d_plugin/bevformer/modules/decoder.py#L160-L164
as well as the Deformable DETR paper:

> M = 8 and K = 4 are set for deformable attentions by default.

where K is the number of sampled keys in each feature level for each attention head, and M is the number of attention heads.
I am not sure how much of a difference it will make, but I wanted to warn other people.
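For reference, here is a minimal sketch of the presumed fix, assuming the intent was to follow the Deformable DETR defaults of M = 8 heads and K = 4 sampling points (`dim` is whatever model dimension is already used at that call site):

```python
# Presumed fix for simple_bev/nets/bevformernet.py, line 296:
# swap the two values back so the call matches the MSDeformAttn
# defaults (n_heads=8, n_points=4) used by BEVFormer / Deformable DETR.
self.deformable_attention = MSDeformAttn(d_model=dim, n_levels=1, n_heads=8, n_points=4)
```

If I read the Deformable DETR implementation correctly, `MSDeformAttn` also requires `d_model` to be divisible by `n_heads`, so this change would additionally alter the per-head dimension from `dim / 4` to `dim / 8`.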