-
Notifications
You must be signed in to change notification settings - Fork 6
Open
Labels
good first issueGood for newcomersGood for newcomers
Description
Description
Using some kind of clustering algorithm to predict a class per document. Classes may be genre, topic, usefulness, etc. Finding the closest cluster per document relies on a distance metric.
Objectives
- Implement different clustering algorithms to classify documents into an arbitrary set of classes. Text similarity would be a good starting point as the distance metric utilized.
- Use zero-shot learning (ZSL) to classify documents from a group of pre-determined classes. HuggingFace has a pipeline for that. Checkout the comments in here.
Metadata
Metadata
Assignees
Labels
good first issueGood for newcomersGood for newcomers