[BUG] naflexvit_so400m_patch16_siglip has undocumented different default pos_embed_interp_mode of "bicubic" instead of "bilinear" #2542

# Updates

Per further discussion, the difference is intentional but undocumented. It is a deviation from the reference implementation in Google Big Vision.

---

# Original Report

Fix location:
https://github.com/huggingface/pytorch-image-models/blob/a7c5368ba0c8713dc1c9a98cc83bf46ddd02b0a0/timm/models/naflexvit.py#L1767

This causes the default to be "bicubic":
https://github.com/huggingface/pytorch-image-models/blob/a7c5368ba0c8713dc1c9a98cc83bf46ddd02b0a0/timm/models/naflexvit.py#L90

Reference code showing "bilinear" interpolation:
https://github.com/google-research/big_vision/blob/0127fb6b337ee2a27bf4e54dea79cff176527356/big_vision/models/proj/image_text/naflex_vit.py#L67
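To illustrate why the interpolation mode matters, here is a minimal, dependency-free sketch that resamples a 1D stand-in for a position embedding with linear and cubic interpolation. It is not the timm or Big Vision code: the function names, the half-pixel coordinate convention, and the cubic kernel coefficient `a = -0.75` are all assumptions chosen for illustration, and the real models interpolate a 2D grid.

```python
import math


def _src_coord(i, in_size, out_size):
    # Half-pixel convention (an assumption): map output index i to a
    # fractional position in the input.
    return (i + 0.5) * in_size / out_size - 0.5


def _clamp(k, n):
    # Replicate edge samples for taps that fall outside the input.
    return min(max(k, 0), n - 1)


def resize_linear(values, out_size):
    """Resample a 1D list with linear interpolation (2 taps)."""
    n = len(values)
    out = []
    for i in range(out_size):
        x = _src_coord(i, n, out_size)
        x0 = math.floor(x)
        t = x - x0
        left = values[_clamp(x0, n)]
        right = values[_clamp(x0 + 1, n)]
        out.append((1 - t) * left + t * right)
    return out


def _cubic_weight(d, a=-0.75):
    # Cubic convolution kernel; a = -0.75 is an illustrative assumption.
    d = abs(d)
    if d <= 1:
        return ((a + 2) * d - (a + 3)) * d * d + 1
    if d < 2:
        return ((a * d - 5 * a) * d + 8 * a) * d - 4 * a
    return 0.0


def resize_cubic(values, out_size):
    """Resample a 1D list with cubic interpolation (4 taps)."""
    n = len(values)
    out = []
    for i in range(out_size):
        x = _src_coord(i, n, out_size)
        x0 = math.floor(x)
        acc = 0.0
        for k in range(x0 - 1, x0 + 3):
            acc += _cubic_weight(x - k) * values[_clamp(k, n)]
        out.append(acc)
    return out


if __name__ == "__main__":
    peak = [0.0, 0.0, 1.0, 0.0, 0.0]
    print("linear:", resize_linear(peak, 10))
    print("cubic: ", resize_cubic(peak, 10))
```

In this sketch the linear result stays within the range of the input, while the cubic kernel's negative lobes produce small undershoots around the peak. Differences of this kind in the resampled position embedding are what propagate into the downstream activations.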

After changing the default to "bilinear", TIMM is able to forward siglip2 naflex with cosine similarity above 0.9999 at each intermediate output.
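A per-layer comparison of this kind can be sketched as follows. This is a hypothetical helper, not the script used for the report: it assumes each intermediate has already been flattened into a plain list of floats, and the names `cosine_similarity` and `check_intermediates` are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def check_intermediates(ours, reference, threshold=0.9999):
    """Compare two lists of flattened intermediates layer by layer.

    Returns the per-layer similarities and whether all exceed the threshold.
    """
    if len(ours) != len(reference):
        raise ValueError("both models must provide the same number of intermediates")
    sims = [cosine_similarity(u, v) for u, v in zip(ours, reference)]
    return sims, all(s > threshold for s in sims)


if __name__ == "__main__":
    # Toy data standing in for flattened activations from two implementations.
    ours = [[1.0, 2.0, 3.0], [0.5, -0.5, 1.0]]
    reference = [[1.0, 2.0, 3.001], [0.5, -0.5, 1.0]]
    sims, ok = check_intermediates(ours, reference)
    print(sims, ok)
```

Checking every intermediate rather than only the final output helps locate where two implementations begin to diverge, which is how a mismatch in the position embedding interpolation would show up early in the network.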
