About phoneme label's accuracy rate #10

@colstone

After manually verifying approximately 0.5% of the samples in the dataset, I've identified potential accuracy issues in phoneme labeling. Specifically:

  1. Cross-checking the labels in Praat and vLabeler, I found multiple samples where the labeled phoneme boundaries do not match the actual acoustic boundaries.
  2. These discrepancies mainly take two forms:
    • Segmentation lines that are offset from the actual phoneme boundaries
    • Implausible segment durations (either too long or too short)
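
The second kind of discrepancy could be screened for automatically. Below is a minimal sketch, assuming each sample's labels are available as (phoneme, start, end) tuples in seconds. The function name, the tuple layout, and the 20 ms / 1 s thresholds are illustrative assumptions, not part of the dataset's format or tooling.

```python
from typing import List, Tuple

# Hypothetical layout: (phoneme, start_sec, end_sec)
Interval = Tuple[str, float, float]


def flag_suspicious_intervals(
    intervals: List[Interval],
    min_dur: float = 0.02,  # assumed lower bound, in seconds
    max_dur: float = 1.0,   # assumed upper bound, in seconds
) -> List[Interval]:
    """Return intervals whose duration falls outside [min_dur, max_dur]."""
    flagged = []
    for label, start, end in intervals:
        duration = end - start
        if duration < min_dur or duration > max_dur:
            flagged.append((label, start, end))
    return flagged


sample = [("a", 0.0, 0.005), ("b", 0.005, 0.2), ("c", 0.2, 1.5)]
print(flag_suspicious_intervals(sample))
```

Suitable thresholds likely depend on the language and on the phoneme class (long vowels and silences can legitimately exceed one second), so flagged intervals are candidates for manual review rather than confirmed errors.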

I would like to know:

  1. The current accuracy level of phoneme labeling in the dataset
  2. Whether there are any relevant evaluation reports or metrics available
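
To make "accuracy level" concrete, one metric that could be reported is the fraction of labeled boundaries that fall within a fixed tolerance of manually verified boundaries. The sketch below assumes the two boundary lists are already paired one-to-one; the function name and the 20 ms default tolerance are assumptions chosen for illustration, not values taken from the dataset.

```python
from typing import Sequence


def boundary_agreement(
    reference: Sequence[float],
    labeled: Sequence[float],
    tolerance: float = 0.02,  # assumed tolerance, in seconds
) -> float:
    """Fraction of labeled boundaries within `tolerance` of the reference."""
    if len(reference) != len(labeled):
        raise ValueError("boundary lists must be paired one-to-one")
    if not reference:
        raise ValueError("no boundaries to compare")
    hits = sum(1 for r, b in zip(reference, labeled) if abs(r - b) <= tolerance)
    return hits / len(reference)


manual = [0.10, 0.25, 0.40, 0.62]   # hand-checked boundaries (made-up values)
dataset = [0.11, 0.30, 0.40, 0.61]  # boundaries from the labels (made-up values)
print(boundary_agreement(manual, dataset))  # 0.75
```

Reporting this figure at more than one tolerance, together with the size of the manually verified subset, would let users judge how far the labels can be trusted.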

Labeling accuracy directly affects how the dataset can be used and evaluated. Thank you for your attention.
