Find more relevant labels #37

@Berkmann18

Description

At the moment, the dataset looks like this:
[image: dataset]
This is not good, and it is what we have after down-sampling the null labels (i.e. labels that can't be classified into one of the categories in https://allcontributors.org/docs/en/emoji-key), which make up ≈ 16.61% of the whole dataset (ideally they would be fewer than business, ..., userTesting combined).
Down-sampling the null labels further would be an option; however, most of the ones left seem (fairly) widely used.

So the remaining option is to boost the other categories by adding more labels to them, especially labels that can be found in GitHub/GitLab/Bitbucket repos alone.
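
To make the balance question concrete, here is a minimal Python sketch of how each category's share could be measured and how the null labels could be capped at a chosen share. It is an illustration only, not this project's code: the `(label, category)` row layout, the helper names `class_shares` and `downsample`, and the sample rows are all assumptions.

```python
import random
from collections import Counter


def class_shares(rows):
    """Return each category's share of the dataset as a fraction in [0, 1]."""
    counts = Counter(category for _, category in rows)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def downsample(rows, category, max_share, seed=0):
    """Randomly drop rows of `category` so it is at most `max_share` of the result."""
    if not 0 <= max_share < 1:
        raise ValueError("max_share must be in [0, 1)")
    keep = [row for row in rows if row[1] != category]
    pool = [row for row in rows if row[1] == category]
    # Largest k such that k / (len(keep) + k) <= max_share.
    limit = int(max_share * len(keep) / (1 - max_share))
    rng = random.Random(seed)  # seeded so the sample is reproducible
    return keep + rng.sample(pool, min(limit, len(pool)))


# Hypothetical rows: (label text, category); "null" marks unclassifiable labels.
rows = (
    [("bug", "bug")] * 3
    + [("enhancement", "code")] * 3
    + [("documentation", "doc")] * 2
    + [(f"misc-{i}", "null") for i in range(12)]
)

balanced = downsample(rows, "null", max_share=0.2)
print(class_shares(balanced))
```

Capping a single class this way never grows the smaller categories, so it would only complement, not replace, collecting more labels for them.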
