Label assertion and mapping in Machine

Currently, several classification algorithms each check whether the input Labels are valid, e.g. that the class labels are contiguous `[0, 1, ..., n_classes-1]`, which leads to a lot of duplicated code.
These checks should be done by the `Machine` base class when training is performed. The `Machine` would then store the mapping from any Label input to an internal encoding. For example, a binary classification task would map {10, 20} -> {-1, +1} using a `BinaryLabelEncoder` class, and a `MulticlassLabelsEncoder` class would do the same for multiclass tasks. The properly encoded Labels are then dispatched to the `train_machine` method. When `apply` is called, the returned Labels are mapped back to the user's input Label space using the `LabelEncoder`.
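To make the proposed mapping concrete, here is a minimal Python sketch of the binary case. It is an illustration only, not the project's API: the class name `BinaryLabelEncoderSketch`, its methods and its attributes are hypothetical, and sorting the two label values to decide which one becomes -1 is an assumption.

```python
class BinaryLabelEncoderSketch:
    """Hypothetical encoder: maps two arbitrary label values to {-1, +1}."""

    def fit(self, labels):
        # Assumption: the smaller value maps to -1, the larger to +1.
        classes = sorted(set(labels))
        if len(classes) != 2:
            raise ValueError(
                f"binary labels need exactly 2 distinct values, got {len(classes)}"
            )
        self.to_internal = {classes[0]: -1, classes[1]: +1}
        self.to_user = {-1: classes[0], +1: classes[1]}
        return self

    def transform(self, labels):
        # Map user labels to the internal encoding; reject unseen values.
        try:
            return [self.to_internal[y] for y in labels]
        except KeyError as err:
            raise ValueError(f"label {err.args[0]!r} was not seen in fit") from None

    def inverse_transform(self, encoded):
        # Map internal {-1, +1} predictions back to the user's label space.
        return [self.to_user[y] for y in encoded]


user_labels = [10, 20, 20, 10]
encoder = BinaryLabelEncoderSketch().fit(user_labels)
encoded = encoder.transform(user_labels)
restored = encoder.inverse_transform(encoded)

# Three distinct values cannot be treated as binary labels.
try:
    BinaryLabelEncoderSketch().fit([-1, 0, 1])
    rejected = False
except ValueError:
    rejected = True
```

Here `encoded` holds the internal representation an algorithm would receive, and `restored` is what the user would get back from `apply`.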

The tasks (in order):
- [x] Write a `LabelEncoder` base class and the respective `BinaryLabelEncoder` and `MulticlassLabelsEncoder` derived classes. These should also check that the Labels are valid, e.g. {-1, 0, 1} cannot be transformed to BinaryLabels. #5067
- [ ] Add `LabelEncoder` as a `Machine` class member.
- [ ] Fit the `LabelEncoder` and transform the input in `train`, then perform the inverse operation in `apply`.
- [ ] Remove label checks from `Machine` subclasses, since algorithms are now guaranteed to receive a valid Label representation.
- [ ] Cross-validation would use its own mapping and pass it on to each fold's `Machine`, in order to keep the same mapping across folds.
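The remaining tasks could fit together roughly as in the Python sketch below. All names here (`MulticlassEncoderSketch`, `MachineSketch`, `MajorityClassMachine`, `apply_machine`, the `encoder` constructor argument) are hypothetical stand-ins chosen for illustration, and the toy majority-class algorithm is not part of the proposal. The sketch assumes the base class fits an encoder in `train` unless a pre-fitted one was injected, which is one possible way for cross-validation to share a mapping across folds.

```python
class MulticlassEncoderSketch:
    """Hypothetical encoder: maps arbitrary labels to 0..n_classes-1."""

    def fit(self, labels):
        self.classes = sorted(set(labels))
        self.to_internal = {c: i for i, c in enumerate(self.classes)}
        return self

    def transform(self, labels):
        return [self.to_internal[y] for y in labels]

    def inverse_transform(self, encoded):
        return [self.classes[i] for i in encoded]


class MachineSketch:
    """Hypothetical base class: owns the encoder; subclasses see encoded labels."""

    def __init__(self, encoder=None):
        # A pre-fitted encoder may be injected, e.g. by cross-validation.
        self.encoder = encoder

    def train(self, features, labels):
        if self.encoder is None:
            self.encoder = MulticlassEncoderSketch().fit(labels)
        self.train_machine(features, self.encoder.transform(labels))
        return self

    def apply(self, features):
        # Map internal predictions back to the user's label space.
        return self.encoder.inverse_transform(self.apply_machine(features))


class MajorityClassMachine(MachineSketch):
    """Toy algorithm: always predicts the most frequent training class."""

    def train_machine(self, features, encoded_labels):
        # No label checks here: labels already lie in 0..n_classes-1.
        self.majority = max(set(encoded_labels), key=encoded_labels.count)

    def apply_machine(self, features):
        return [self.majority for _ in features]


labels = ["cat", "dog", "dog", "bird", "cat", "cat"]
features = [[float(i)] for i in range(len(labels))]

# Cross-validation fits one mapping on all labels and shares it across folds.
shared = MulticlassEncoderSketch().fit(labels)
folds = [([0, 1, 2], [3, 4, 5]), ([3, 4, 5], [0, 1, 2])]  # (train, test) indices

predictions = []
for train_idx, test_idx in folds:
    machine = MajorityClassMachine(encoder=shared)
    machine.train([features[i] for i in train_idx], [labels[i] for i in train_idx])
    predictions.append(machine.apply([features[i] for i in test_idx]))
```

Note that the first fold never sees `"bird"` during training, yet its machine still uses the same class-to-index mapping as the second fold because the encoder was fitted once on the full label set.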

Most of this code already exists, but it is scattered across the code base.

Label assertion and mapping in Machine #5054
