
Issue with training on my own dataset #20

@rishabhjain16

I have been trying to train the FragmentVC model on my own dataset. It works fine with the VCTK dataset, but with my own dataset I get the error below. It may have something to do with my dataset and its structure. My dataset is non-native English speech, and I want to find out whether I can do VC from, say, LibriSpeech to non-native English and vice versa. I am not quite sure how to fix this.

root@06089af1684b:/workspace/vc/FragmentVC# CUDA_VISIBLE_DEVICES=1 python train.py features_myst --save_dir ./ckpts_myst --batch_size 16 --preload
100% 17163/17163 [00:18<00:00, 913.63it/s]
Train:   0% 0/1000 [00:00<?, ? step/s]Traceback (most recent call last):
  File "train.py", line 247, in <module>
    main(**parse_args())
  File "train.py", line 166, in main
    batch = next(train_iterator)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1065, in _next_data
    return self._process_data(data)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/opt/conda/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataset.py", line 272, in __getitem__
    return self.dataset[self.indices[idx]]
  File "/workspace/vc/FragmentVC/data/intra_speaker_dataset.py", line 73, in __getitem__
    for sampled_id in random.sample(utterance_indices, self.n_samples):
  File "/opt/conda/lib/python3.8/random.py", line 363, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative

Train:   0% 0/1000 [00:01<?, ? step/s]

I think it most probably has something to do with the structure of my dataset, or perhaps with the length of the audio files. I tried looking around but didn't find any working solution. Any help is appreciated. Thanks in advance.
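Update: from the traceback, it looks like random.sample in intra_speaker_dataset.py is being asked for more utterances (n_samples) than at least one speaker in my dataset actually has. A quick check along these lines should confirm it; note that the metadata.json layout below is just my guess at what preprocess outputs, so the keys may need adjusting:

import json
from pathlib import Path

# Rough sketch: I'm assuming metadata.json maps each speaker id to a list
# of utterance entries; adjust to the actual layout of the features dir.
metadata = json.loads(Path("features_myst/metadata.json").read_text())
for speaker, utterances in sorted(metadata.items()):
    if len(utterances) < 5:  # 5 is a placeholder for the dataset's n_samples
        print(f"{speaker}: only {len(utterances)} utterance(s)")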
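If that is the cause, a stopgap would be to sample with replacement whenever a speaker has fewer utterances than n_samples. A minimal sketch of what I mean (not the repo's actual code):

import random

def sample_utterances(utterance_indices, n_samples):
    # random.sample raises "Sample larger than population or is negative"
    # when n_samples > len(utterance_indices); random.choices samples with
    # replacement, so it works for any non-empty population.
    if len(utterance_indices) >= n_samples:
        return random.sample(utterance_indices, n_samples)
    return random.choices(utterance_indices, k=n_samples)

That said, filtering out speakers with too few utterances, or lowering n_samples, is probably the cleaner fix.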
