Hello,
I have been trying to use both the 16 kHz and 44 kHz versions of DAC. I want to use the same model with different numbers of codebooks. For example, for the 16 kHz model I use:
import dac
import torchaudio

# Download a model
model_path = dac.utils.download(model_type="16khz")
model = dac.DAC.load(model_path)
model.to('cuda')
# Load audio signal file
audio, sr = torchaudio.load('input.wav')
# Add a batch dimension: [channels, time] -> [1, channels, time]
audio = audio.unsqueeze(0)
z, codes, latents, _, _ = model.encode(audio.to('cuda'), n_quantizers=12)
The shape of the output codes tensor is torch.Size([1, 12, 292]).
However, if I instead run:
z, codes, latents, _, _ = model.encode(audio.to('cuda'), n_quantizers=2)
The output codes still have the same shape, although I would expect torch.Size([1, 2, 292]), since only two codebooks are used.
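To make the expectation concrete, here is a toy residual quantizer in plain Python. It is only a sketch and not DAC's implementation: `rvq_encode`, `nearest`, and the random scalar codebooks are hypothetical names made up for illustration. It assumes the encoder stops after `n_quantizers` stages, so the codes get one row per stage actually used:

```python
import random


def nearest(codebook, x):
    """Return the index of the codebook entry closest to the scalar x."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))


def rvq_encode(frames, codebooks, n_quantizers=None):
    """Toy residual VQ over scalar frames (hypothetical, not DAC's code).

    Returns codes as a list of rows, one row per quantizer stage used.
    """
    if n_quantizers is None:
        n_quantizers = len(codebooks)
    residual = list(frames)
    codes = []
    # Only the first n_quantizers stages run; later codebooks are skipped.
    for codebook in codebooks[:n_quantizers]:
        idx = [nearest(codebook, r) for r in residual]
        # Each stage quantizes what the previous stages left over.
        residual = [r - codebook[i] for r, i in zip(residual, idx)]
        codes.append(idx)
    return codes


random.seed(0)
T = 292  # number of frames, matching the example above
frames = [random.uniform(-1, 1) for _ in range(T)]
# 12 codebooks with 16 entries each, shrinking in scale per stage.
codebooks = [
    [random.uniform(-1, 1) / (2 ** k) for _ in range(16)] for k in range(12)
]

full = rvq_encode(frames, codebooks)
two = rvq_encode(frames, codebooks, n_quantizers=2)
print(len(full), len(full[0]))
print(len(two), len(two[0]))
```

In this sketch the two-stage codes are exactly the first two rows of the twelve-stage codes, which is why I would expect a [1, 2, 292] tensor once a batch dimension is added.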