ValueError when using swin_upernet models #24

Closed
SUC-DriverOld opened this issue Sep 23, 2024 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@SUC-DriverOld
Owner

SUC-DriverOld commented Sep 23, 2024

A ValueError is raised when using swin_upernet models. For the solution, refer to ZFTurbo/Music-Source-Separation-Training/issues/6.

ValueError: Make sure that the channel dimension of the pixel values match with the one set in the configuration.
SUC-DriverOld added the bug and enhancement labels Sep 23, 2024
SUC-DriverOld pinned this issue Sep 23, 2024
SUC-DriverOld self-assigned this Sep 23, 2024
@SUC-DriverOld
Owner Author

SUC-DriverOld commented Sep 24, 2024

ValueError when using swin_upernet models

ValueError: Make sure that the channel dimension of the pixel values match with the one set in the configuration.

Follow the steps below to solve this problem:

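  1. Open site-packages\transformers\models\swin\modeling_swin.py in the Python environment you run the models from.
  2. Replace the forward method of SwinPatchEmbeddings (line 312 at the time of writing; the line number may differ between transformers versions) with the following, which accepts the input's channel count instead of raising the ValueError: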
    def forward(self, pixel_values: Optional[torch.FloatTensor]) -> Tuple[torch.Tensor, Tuple[int]]:
        _, num_channels, height, width = pixel_values.shape
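        # On a channel mismatch, adopt the input's channel count
        # rather than raising the ValueError shown above.
        if num_channels != self.num_channels:
            self.num_channels = num_channels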
        # pad the input to be divisible by self.patch_size, if needed
        pixel_values = self.maybe_pad(pixel_values, height, width)
        embeddings = self.projection(pixel_values)
        _, _, height, width = embeddings.shape
        output_dimensions = (height, width)
        embeddings = embeddings.flatten(2).transpose(1, 2)

        return embeddings, output_dimensions
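
If you are unsure where site-packages lives for the environment in use, the file from step 1 can be located from Python. This is only a minimal sketch: locate_module_file is a hypothetical helper name (not part of transformers), and it assumes transformers is installed in the interpreter you run it with. It uses the standard library's importlib.util.find_spec, which imports the parent packages but not the target module itself.

```python
import importlib.util
from typing import Optional


def locate_module_file(module_name: str) -> Optional[str]:
    """Return the file path of an importable module, or None if it cannot be found.

    Hypothetical helper for illustration; not part of transformers.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # ImportError (ModuleNotFoundError) is raised when a parent package,
        # e.g. transformers, is not installed in this environment.
        return None
    if spec is None or spec.origin is None:
        return None
    return spec.origin


if __name__ == "__main__":
    path = locate_module_file("transformers.models.swin.modeling_swin")
    print(path or "transformers (or its Swin module) was not found in this environment")
```

Run it with the same interpreter that raises the ValueError, so that the path printed belongs to the installation that actually needs the edit.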

SUC-DriverOld added a commit that referenced this issue Sep 24, 2024
@SUC-DriverOld SUC-DriverOld unpinned this issue Sep 24, 2024
@SUC-DriverOld SUC-DriverOld changed the title Bug&Feat: NEED TO FIX ValueError when using swin_upernet models Oct 29, 2024
SUC-DriverOld removed the enhancement label Oct 29, 2024
@SUC-DriverOld SUC-DriverOld pinned this issue May 27, 2025