See vocaloid_pyneuralfx/Pyneuralfx.ipynb and run the blocks. You can run the grid search first to find the best configuration (you have to modify the search space in vocaloid_pyneuralfx/Pyneuralfx/frame_work/grid_search.py), then manually update the snapshot model configuration in vocaloid_pyneuralfx/Pyneuralfx/configs/cnn/tcn/snapshot_tcn.yml to match the best configuration. Then run vocaloid_pyneuralfx/Pyneuralfx/frame_work/main_snapshot.py to train the best model. After training and inference are complete, the generated files are stored in vocaloid_pyneuralfx/Pyneuralfx/frame_work/exp/vocaloid/valid_gen. Alternatively, you can just listen to the validation file listed in the last block.
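
To check what inference produced, a small helper like the one below can list the files in the valid_gen folder. This is only a sketch: the function name list_generated_audio is hypothetical, and it assumes the generated files are .wav files, which the notebook does not state explicitly.

```python
from pathlib import Path
from typing import List


def list_generated_audio(valid_gen_dir: str) -> List[Path]:
    """Return all .wav files under valid_gen_dir, sorted by path.

    Assumption: the inference outputs are .wav files. Returns an empty
    list if the directory does not exist yet (e.g. before inference).
    """
    root = Path(valid_gen_dir)
    if not root.is_dir():
        return []
    # rglob also picks up files in nested subfolders, if any exist.
    return sorted(root.rglob("*.wav"))


if __name__ == "__main__":
    out_dir = "vocaloid_pyneuralfx/Pyneuralfx/frame_work/exp/vocaloid/valid_gen"
    for path in list_generated_audio(out_dir):
        print(path)
```

If the list comes back empty, either inference has not been run yet or the outputs use a different extension than assumed here.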
Training data is in data/audio/. The files named sv_teto_X_X_X_X_MixDown.wav are the audio files for training; the four X fields are the parameters {loudness}_{?}_{?}_{gender}.
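
The naming scheme above can be split into its fields with a short helper. This is a minimal sketch, not part of the repository: the function name parse_training_filename is hypothetical, the two undocumented fields are kept under the placeholder keys unknown_1 and unknown_2, and all values are returned as raw strings because the value format of each field is not documented.

```python
import re
from typing import Dict

# Matches names of the form sv_teto_X_X_X_X_MixDown.wav.
# Each X is captured as a raw string (anything without an underscore),
# since the README does not say what values the fields take.
_FILENAME_PATTERN = re.compile(
    r"^sv_teto_([^_]+)_([^_]+)_([^_]+)_([^_]+)_MixDown\.wav$"
)


def parse_training_filename(name: str) -> Dict[str, str]:
    """Split a training file name into its four parameter fields."""
    match = _FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected training file name: {name!r}")
    loudness, unknown_1, unknown_2, gender = match.groups()
    return {
        "loudness": loudness,
        "unknown_1": unknown_1,  # meaning not documented ("?" in the README)
        "unknown_2": unknown_2,  # meaning not documented ("?" in the README)
        "gender": gender,
    }


if __name__ == "__main__":
    # The field values here are made up purely for illustration.
    print(parse_training_filename("sv_teto_a_b_c_d_MixDown.wav"))
```

Keeping the fields as strings avoids guessing at numeric ranges; convert them once the meaning of each parameter is known.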