Open
Description
After manually verifying approximately 0.5% of the samples in the dataset, I've identified potential accuracy issues in phoneme labeling. Specifically:
- Through cross-validation using Praat and vLabeler, I found that multiple samples show discrepancies between phoneme segmentation boundaries and actual acoustic boundaries
- These discrepancies mainly manifest as:
  - Misalignment between segmentation lines and actual phoneme boundaries
  - Notably unreasonable segmentation interval lengths (either too long or too short)
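To make the second point concrete, here is a minimal sketch of how implausible interval lengths could be flagged automatically. It is only an illustration: the `Interval` type, the `flag_suspicious` helper, and the 20 ms / 500 ms thresholds are assumptions chosen for the example, not values taken from the dataset or its tooling, and sensible limits would differ by phoneme class and language.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    start: float  # interval start time in seconds
    end: float    # interval end time in seconds
    label: str    # phoneme symbol


def flag_suspicious(
    intervals: List[Interval],
    min_dur: float = 0.02,  # assumed lower bound (20 ms), illustrative only
    max_dur: float = 0.5,   # assumed upper bound (500 ms), illustrative only
) -> List[Tuple[Interval, float]]:
    """Return (interval, duration) pairs whose duration is outside [min_dur, max_dur]."""
    flagged = []
    for iv in intervals:
        dur = iv.end - iv.start
        if dur < min_dur or dur > max_dur:
            flagged.append((iv, dur))
    return flagged


# Hypothetical intervals, not taken from the dataset.
example = [
    Interval(0.00, 0.01, "k"),  # too short under the assumed limits
    Interval(0.01, 0.15, "a"),  # plausible
    Interval(0.15, 1.20, "t"),  # too long under the assumed limits
]

for iv, dur in flag_suspicious(example):
    print(f"{iv.label}: {dur:.3f} s")
```

Intervals flagged this way would still need manual inspection in Praat or vLabeler, since long silences or sustained vowels can legitimately exceed a fixed threshold.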
I would like to know:
- The current accuracy level of phoneme labeling in the dataset
- Whether there are any relevant evaluation reports or metrics available
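As an example of the kind of metric I have in mind, boundary accuracy is often reported as the fraction of labeled boundaries that fall within a fixed tolerance of a manually verified reference. The sketch below assumes the two boundary lists are already matched one-to-one; the `boundary_accuracy` name and the 20 ms tolerance are illustrative assumptions, not something defined by this project.

```python
from typing import Sequence


def boundary_accuracy(
    predicted: Sequence[float],
    reference: Sequence[float],
    tolerance: float = 0.02,  # assumed tolerance of 20 ms, illustrative only
) -> float:
    """Fraction of predicted boundaries within `tolerance` seconds of the reference.

    Assumes both sequences list the same boundaries in the same order.
    """
    if len(predicted) != len(reference):
        raise ValueError("boundary lists must have the same length")
    if not reference:
        raise ValueError("at least one boundary is required")
    hits = sum(abs(p - r) <= tolerance for p, r in zip(predicted, reference))
    return hits / len(reference)


# Hypothetical boundary times in seconds, not taken from the dataset.
labeled = [0.10, 0.25, 0.40]
manual = [0.11, 0.30, 0.40]
accuracy = boundary_accuracy(labeled, manual)
print(f"{accuracy:.2%} of boundaries within tolerance")
```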
This issue has significant implications for the usage and evaluation of the dataset. Thank you for your attention.