This repository contains the code and resources for a machine learning project on multi-label text classification. The workflow preprocesses the text data, vectorizes it, prepares the tags, trains a multi-label classifier, and evaluates the model's performance.
The project is organized into the following sections:
- Importing necessary libraries
- Loading and cleaning the text data
- Text vectorization using CountVectorizer
- Preparing tags for multi-label classification
- Training a multi-label classifier using the Naive Bayes algorithm
- Predicting labels for the test dataset
- Evaluating the model's performance using Hamming Loss and Precision
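
The steps listed above can be sketched end to end with scikit-learn. This is a minimal illustration, not the notebook's actual code. The toy questions and tags, the `clean_text` helper, the choice of `MultinomialNB` wrapped in `OneVsRestClassifier`, and micro-averaged precision are all assumptions made for the example; the notebook may use a different Naive Bayes variant, multi-label strategy, or averaging mode.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import hamming_loss, precision_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MultiLabelBinarizer


def clean_text(text):
    """Hypothetical cleaning step: lowercase, drop punctuation, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Toy data, invented for illustration; the real dataset is loaded in the notebook.
train_texts = [
    "How to merge two DataFrames in pandas?",
    "Plotting a pandas DataFrame with matplotlib",
    "Python list comprehension syntax",
    "Matplotlib: change the axis labels",
]
train_tags = [["pandas", "python"], ["pandas", "matplotlib"], ["python"], ["matplotlib"]]
test_texts = ["Merge pandas DataFrames in Python"]
test_tags = [["pandas", "python"]]

# Text vectorization: bag-of-words counts, fitted on the training text only.
vectorizer = CountVectorizer(preprocessor=clean_text)
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Tag preparation: one binary indicator column per tag.
mlb = MultiLabelBinarizer()
Y_train = mlb.fit_transform(train_tags)
Y_test = mlb.transform(test_tags)

# Training: one Naive Bayes classifier per tag (assumed strategy).
model = OneVsRestClassifier(MultinomialNB())
model.fit(X_train, Y_train)

# Prediction and evaluation.
Y_pred = model.predict(X_test)
loss = hamming_loss(Y_test, Y_pred)  # fraction of tag slots predicted wrongly
precision = precision_score(Y_test, Y_pred, average="micro", zero_division=0)

print("Tags:", list(mlb.classes_))
print("Predicted:", mlb.inverse_transform(Y_pred))
print("Hamming loss:", loss)
print("Micro precision:", precision)
```

Hamming loss counts every wrong tag decision separately, so a prediction that gets two of three tags right is penalized less than one that gets none right. A lower Hamming loss and a higher precision indicate a better model.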
main_notebook.ipynb: Jupyter notebook containing the code for the machine learning project.
To use this project, clone the repository and run main_notebook.ipynb in a Jupyter environment.