Hello there,
Thanks for your efforts in open-sourcing the code; it's vital for us in trying to reproduce the results presented in the paper.
Problem
I've come across a `RuntimeError` when adapting the model to our private data:
```
/*/EEND-vector-clustering/eend/pytorch_backend/train.py:186: RuntimeWarning: invalid value encountered in true_divide
  fet_arr[spk] = org / norm
...
Traceback (most recent call last):
...
RuntimeError: The loss (nan) is not finite.
```
Detail
After some debugging, I found that the problem actually happens during backpropagation when an entry of the embedding array is left all zeros: the L2 norm of a zero vector is zero, so the normalization below divides zero by zero and fills that entry with NaNs, which later makes the loss NaN:
EEND-vector-clustering/eend/pytorch_backend/train.py
Lines 173 to 186 in b3649ee
```python
fet_arr = np.zeros([spk_num, fet_dim])
# sum
bs = spklabs.shape[0]
for i in range(bs):
    if spkidx_tbl[spklabs[i]] == -1:
        raise ValueError(spklabs[i])
    fet_arr[spkidx_tbl[spklabs[i]]] += spkvecs[i]
# normalize
for spk in range(spk_num):
    org = fet_arr[spk]
    norm = np.linalg.norm(org, ord=2)
    fet_arr[spk] = org / norm
```
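For reference, the failure mode can be reproduced with plain NumPy, independent of the repo: normalizing an all-zero row divides zero by zero, which emits exactly the `RuntimeWarning` above and leaves NaNs behind.

```python
import numpy as np

fet_arr = np.zeros([1, 4])         # an entry that never accumulated any embedding
org = fet_arr[0]
norm = np.linalg.norm(org, ord=2)  # 0.0 for an all-zero vector
fet_arr[0] = org / norm            # RuntimeWarning: invalid value encountered ...
print(fet_arr[0])                  # [nan nan nan nan]
```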
Since these embeddings are loaded from the speaker embeddings dumped by the save_spkv_lab.py script when adapting the model, I suspected there might be an issue in the save_spkv_lab function.
After some careful step-by-step checking with pdb, I found that silent speaker labels are actually added to the all_labels variable when dumping the speaker embeddings:
EEND-vector-clustering/eend/pytorch_backend/infer.py
Lines 349 to 355 in b3649ee
```python
for i in range(args.num_speakers):
    # Exclude samples corresponding to silent speaker
    if torch.sum(t_chunked_t[sigma[i]]) > 0:
        vec = outputs[i+1][0].cpu().detach().numpy()
        lab = chunk_data[2][sigma[i]]
        all_outputs.append(vec)
        all_labels.append(lab)
```
Even when `torch.sum(t_chunked_t[sigma[i]]) > 0`, `lab` can still be -1, which is considered a silent speaker according to the code in:
EEND-vector-clustering/eend/pytorch_backend/diarization_dataset.py
Lines 94 to 99 in b3649ee
```python
S_arr = -1 * np.ones(n_speakers).astype(np.int64)
for seg in filtered_segments:
    speaker_index = speakers.index(self.data.utt2spk[seg['utt']])
    all_speaker_index = self.all_speakers.index(
        self.data.utt2spk[seg['utt']])
    S_arr[speaker_index] = all_speaker_index
```
Since these silent speaker labels are -1 and Python lists support negative indexing, the issue passes silently when dumping the embeddings but leads to the exception above once training begins.
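To illustrate why nothing fails at dump time, here is a toy example (the table contents are made up): a -1 label wraps around to the last entry of the lookup table, so the `spkidx_tbl[spklabs[i]] == -1` check in train.py never fires and the vector is silently accumulated under the wrong speaker.

```python
import numpy as np

spkidx_tbl = np.array([0, 1, 2])  # hypothetical remapping table
lab = -1                          # silent-speaker label leaked into the dump
# Index -1 wraps around to the last element instead of raising,
# so the `== -1` sanity check passes.
print(spkidx_tbl[lab])            # 2
```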
Question
I could simply fix this issue by appending a sample's label to `all_labels` only if `lab >= 0` (i.e., skipping silent speakers) when saving the speaker embeddings, and the subsequent training process then continues smoothly, resulting in a well-performing model.
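Concretely, the fix I have in mind is just an extra guard in the infer.py loop quoted above (a sketch of the idea, not a tested patch):

```python
for i in range(args.num_speakers):
    # Exclude samples corresponding to silent speaker
    if torch.sum(t_chunked_t[sigma[i]]) > 0:
        lab = chunk_data[2][sigma[i]]
        # Additional guard: a label of -1 still marks a silent speaker,
        # so skip it instead of letting it leak into the dumped labels.
        if lab < 0:
            continue
        vec = outputs[i+1][0].cpu().detach().numpy()
        all_outputs.append(vec)
        all_labels.append(lab)
```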
But before opening a PR, I would like to know whether you have ever come across this issue, or whether you have any idea why it happens.
Thanks!