
[ENH] Add a label encoder option when loading dataset #2807

@baraline


Describe the feature or idea you want to propose

When loading datasets, class labels are often returned as strings, which is inconvenient for plotting and other operations (e.g. plotting samples coloured by their class label).

It would be nice to have a boolean parameter that applies (and returns) a label encoder directly when loading, instead of having to do this outside the loading function every time.
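For illustration, the manual step that currently has to be repeated after each load looks roughly like the sketch below. The labels are made up, and plain Python stands in for whichever label encoder is actually used.

```python
# Made-up string labels standing in for what a dataset loader returns.
y_train = ["walking", "running", "walking", "sitting"]

# Manual encoding repeated after every load: map each class to an integer.
classes = sorted(set(y_train))
label_to_int = {label: i for i, label in enumerate(classes)}
y_train_encoded = [label_to_int[label] for label in y_train]

print(classes)          # ['running', 'sitting', 'walking']
print(y_train_encoded)  # [2, 0, 2, 1]
```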

Describe your proposed solution

Add a boolean parameter (defaulting to false so the default behaviour is unchanged) that applies a label encoder to the class labels when loading classification datasets, and returns the fitted encoder.

Some thought is needed on how to properly handle the case where we load the training data and then need to reuse the same label encoder for the testing data.
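One possible shape for this, as a rough sketch only: every name below (`SimpleLabelEncoder`, `fake_load`, `load_encoded`, and the `encode_labels` and `encoder` parameters) is hypothetical and not part of any existing loader. The idea is that the flag fits and returns an encoder on the training split, and an optional `encoder` argument lets the test split reuse it.

```python
class SimpleLabelEncoder:
    """Tiny stand-in for a label encoder such as scikit-learn's LabelEncoder."""

    def fit(self, y):
        self.classes_ = sorted(set(y))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, y):
        # Raises KeyError for labels that were not seen during fit.
        return [self._index[label] for label in y]

    def inverse_transform(self, codes):
        return [self.classes_[c] for c in codes]


def fake_load(split):
    """Hypothetical loader returning (X, y) with string class labels."""
    data = {
        "train": ([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], ["b", "a", "b"]),
        "test": ([[0.7, 0.8], [0.9, 1.0]], ["a", "b"]),
    }
    return data[split]


def load_encoded(split, encode_labels=False, encoder=None):
    """Hypothetical wrapper illustrating the proposed option."""
    X, y = fake_load(split)
    if not encode_labels:
        return X, y  # default behaviour unchanged: raw string labels
    if encoder is None:
        # No encoder supplied: fit a new one on the loaded labels.
        encoder = SimpleLabelEncoder().fit(y)
    return X, encoder.transform(y), encoder


# Fit on the training split, then reuse the same encoder for the test split.
X_train, y_train, enc = load_encoded("train", encode_labels=True)
X_test, y_test, _ = load_encoded("test", encode_labels=True, encoder=enc)
```

Passing the fitted encoder back in keeps the integer codes consistent across splits. A test split containing a label absent from the training split would raise an error here, which is one of the cases that would need a decision.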

Describe alternatives you've considered, if relevant

No response

Additional context

I often need to do this when loading classification datasets, so I might not be the only one (hopefully!)


Labels: datasets (Datasets and data loaders), enhancement (New feature, improvement request or other non-bug code enhancement)
