
The generated wav is not good #14

@pangtouyuqqq

Hi, thank you for open-sourcing this wonderful work!
I followed your instructions: 1) installed lightconv_cuda, 2) downloaded the checkpoint, and 3) downloaded the speaker embedding npy file.
However, the generated audio is not good.

Below is the command I ran and its output:

python3 synthesize.py \
  --text "Hello world" \
  --speaker_id Actor_22 \
  --emotion_id sad \
  --restore_step 450000 \
  --mode single \
  --dataset RAVDESS
# sh run.sh 
2022-11-30 13:45:22.626404: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Device of XSpkEmoTrans: cuda
Removing weight norm...
Raw Text Sequence: Hello world
Phoneme Sequence: {HH AH0 L OW1 W ER1 L D}

ENV

python 3.6.8
fairseq                 0.10.2
torch                   1.7.0+cu110
CUDA 11.0

Hello world_Actor_22_sad

Hello world_Actor_22_sad.wav.zip
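
To put numbers on "not good", it can help to inspect the generated file's header and peak level before listening. The following is only a rough sketch and is not part of the repository: `describe_wav` is a hypothetical helper name, and it assumes the synthesized file is 16-bit PCM, which the standard-library `wave` module can read. A wrong sample rate or a near-zero peak would point to a different problem than a noisy but correctly formatted file.

```python
import struct
import wave


def describe_wav(path):
    """Return basic properties of a 16-bit PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        sample_width = wf.getsampwidth()  # bytes per sample
        sample_rate = wf.getframerate()
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)

    if sample_width != 2:
        # Assumption: the output is 16-bit PCM; other widths are not handled.
        raise ValueError("this sketch only handles 16-bit PCM")

    # Unpack little-endian signed 16-bit samples (channels interleaved).
    samples = struct.unpack("<{}h".format(len(raw) // 2), raw)
    # Peak absolute amplitude, normalized so that full scale is 1.0.
    peak = max((abs(s) for s in samples), default=0) / 32768.0

    return {
        "channels": channels,
        "sample_rate": sample_rate,
        "duration_s": n_frames / sample_rate,
        "peak": peak,
    }
```

For example, `describe_wav("Hello world_Actor_22_sad.wav")` could be called on the unzipped attachment; the path is an assumption and depends on where the file was written or extracted.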
