
Conflict in VQGAN codebook loss #427

@yhao-z

Description


In the code, β multiplies `z_q - z.detach()`:

loss = torch.mean((z_q.detach()-z)**2) + self.beta * torch.mean((z_q - z.detach()) ** 2)

but I think it should multiply the commitment loss, `z_q.detach() - z`, which controls the learning of the encoder, as in the CodeFormer and VQ-VAE papers.
Although the VQ-VAE paper notes that

the results did not vary for values of β ranging from 0.1 to 2.0

β in the code is applied to the wrong term and set to 0.25. Dividing the whole loss by 0.25 shows that the effective weight on the commitment loss, the real β, is 1/0.25 = 4.0.
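A minimal sketch of the point above, assuming PyTorch and arbitrary hypothetical tensor shapes: the repo's loss and the paper's loss differ only in which term β scales, and rescaling the repo's loss by 1/β exposes an effective commitment weight of 1/0.25 = 4.0.

```python
import torch

beta = 0.25  # intended commitment weight from the VQ-VAE paper

# Hypothetical encoder output and quantized codebook vectors
z = torch.randn(4, 8, requires_grad=True)
z_q = torch.randn(4, 8, requires_grad=True)

# Commitment term: stop-gradient on the codebook, pulls the encoder toward z_q
commit = torch.mean((z_q.detach() - z) ** 2)
# Codebook term: stop-gradient on the encoder, pulls the codebook toward z
codebook = torch.mean((z_q - z.detach()) ** 2)

# As written in the repo: beta scales the codebook term
loss_repo = commit + beta * codebook

# As written in the VQ-VAE paper: beta scales the commitment term
loss_paper = codebook + beta * commit

# Rescaling the repo's loss by 1/beta gives the paper's form with
# an effective commitment weight of 1/beta = 4.0
loss_repo_rescaled = codebook + (1.0 / beta) * commit
```

Since the overall loss scale can usually be absorbed into the learning rate, the rescaled form makes it clear that the relative commitment weight ends up at 4.0 rather than 0.25.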

Is there a possibility that this affects the performance of VQ-VAE?
