
Inaccurate Tone Adjustment in tone_sandhi #4

Open
@apinge

Description

In the Python version, tone_sandhi adjusts the segmentation and tones of Chinese characters. This issue tracks tone inaccuracies in the original Python version. We can either fix them directly in our C++ version or submit PRs to the original repo.

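  • "地" as a noun
    e.g. "大地" ("the earth") in “温暖的太阳照耀着大地” ("the warm sun shines on the earth").
    Here "地" is read dì (tone 4), but the rule below forces every word-final "地" to the neutral tone (5).
    Related code snippet (in _neural_sandhi):
elif len(word) >= 1 and word[-1] in "的地得":
    finals[-1] = finals[-1][:-1] + "5"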

Possible change:

elif len(word) >= 1 and word[-1] in "的地得" and pos[0] != 'n':
    finals[-1] = finals[-1][:-1] + "5"
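
To make the effect of the proposed guard concrete, here is a minimal sketch that isolates just this branch. The helper name `neutralize_particle` is hypothetical (it is not part of tone_sandhi), the part-of-speech tags are assumed to be jieba-style strings such as "n" or "uv", and the finals lists are illustrative inputs rather than real pypinyin output. It uses `pos.startswith("n")` in place of `pos[0] != 'n'`, which behaves the same for non-empty tags and also tolerates an empty one.

```python
from typing import List


def neutralize_particle(word: str, pos: str, finals: List[str]) -> List[str]:
    """Hypothetical helper: the proposed branch of _neural_sandhi in isolation.

    Sets the last final to the neutral tone ("5") when the word ends in one of
    的/地/得, unless the word is tagged as a noun (tag starting with "n").
    Returns a new list and leaves the input untouched.
    """
    out = list(finals)
    if word and word[-1] in "的地得" and not pos.startswith("n"):
        out[-1] = out[-1][:-1] + "5"
    return out


# Noun: "大地" keeps tone 4 on its last syllable.
print(neutralize_particle("大地", "n", ["a4", "i4"]))
# Particle (assumed tag "uv"): the last final is neutralized.
print(neutralize_particle("地", "uv", ["i4"]))
```

Without the noun check, the first call would rewrite the last final to "i5", which is the inaccuracy described above. Whether a part-of-speech check alone covers every noun ending in 的/地/得 still needs to be verified against real segmenter output.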


Labels: bug, enhancement
