This project is designed to classify YouTube comments as toxic or non-toxic using BERT (Bidirectional Encoder Representations from Transformers). By fine-tuning a pre-trained BERT model, we leverage state-of-the-art NLP capabilities to identify harmful content in online conversations.
The model is trained to identify toxicity, a capability that is crucial for creating safer and more respectful online platforms. The project can be extended to cover different toxicity levels or integrated into moderation tools.
- Dataset
- Installation
- Data Preprocessing
- Model Architecture
- Training
- Evaluation
- Results
- Usage
- Future Improvements
- License
The dataset consists of YouTube comments with corresponding labels indicating whether each comment is toxic (`1`) or non-toxic (`0`). This binary classification problem is aimed at improving content moderation in online discussions.
- Columns:
  - `comment_id`: Unique identifier for each comment.
  - `content`: Text of the comment.
  - `label`: `1` for toxic comments, `0` for non-toxic comments.
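For illustration, the dataset can be loaded and inspected with pandas; the file name below is a placeholder and may differ from the actual data file.

```python
import pandas as pd

# Hypothetical file name; adjust to the actual dataset location.
df = pd.read_csv("youtube_comments.csv")

# Expect the columns described above: comment_id, content, label.
print(df[["comment_id", "content", "label"]].head())
print(df["label"].value_counts())  # quick check of class balance
```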
To get started with this project, follow these instructions:
- Python 3.7+
- PyTorch 1.6+
- Hugging Face Transformers library
- CUDA-enabled GPU (optional but recommended)
- Clone the repository:

```bash
git clone https://github.com/your-repository/youtube-toxic-comment-classification.git
cd youtube-toxic-comment-classification
```
The raw comments are preprocessed before being fed into the BERT model. This includes:
- Removing URLs, special characters, and extra spaces.
- Converting all text to lowercase.
- Tokenizing with the BERT tokenizer, which converts the text into token IDs compatible with BERT (a sketch follows this list).
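A minimal sketch of these preprocessing steps, assuming the Hugging Face `bert-base-uncased` tokenizer; the exact checkpoint and maximum sequence length used in the project are assumptions:

```python
import re
from transformers import BertTokenizer

def clean_text(text: str) -> str:
    """Remove URLs and special characters, lowercase, and collapse extra spaces."""
    text = re.sub(r"http\S+|www\.\S+", " ", text)      # strip URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())   # lowercase, drop special characters
    return re.sub(r"\s+", " ", text).strip()           # collapse extra whitespace

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

encoded = tokenizer(
    clean_text("Check this out!!! https://example.com"),
    padding="max_length",
    truncation=True,
    max_length=128,          # assumed maximum sequence length
    return_tensors="pt",
)
print(encoded["input_ids"].shape)  # torch.Size([1, 128])
```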
This project uses BERT (Bidirectional Encoder Representations from Transformers) for the NLP task of toxic comment detection. BERT is a transformer-based model that understands the context of words in sentences, making it highly effective for text classification tasks.
- Tokenizer: Converts sentences into token IDs.
- BERT for Sequence Classification: Pre-trained BERT model fine-tuned for binary classification.
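The snippet below sketches how the model might be set up and fine-tuned; the checkpoint name, learning rate, and batch contents are illustrative assumptions rather than the project's exact configuration:

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # binary classification: non-toxic (0) vs. toxic (1)
)

# One fine-tuning step on a toy batch to show the mechanics.
batch = tokenizer(
    ["thanks for the helpful video", "nobody wants to hear your garbage"],
    padding=True,
    truncation=True,
    max_length=128,
    return_tensors="pt",
)
labels = torch.tensor([0, 1])
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

outputs = model(**batch, labels=labels)  # forward pass returns logits and cross-entropy loss
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```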
After fine-tuning BERT, we achieved the following performance metrics:
1. Accuracy: ~90%
2. F1 Score: ~0.85
3. Precision: ~0.88
4. Recall: ~0.83
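As a quick sanity check, the reported F1 score is consistent with the precision and recall above, since F1 is their harmonic mean:

```python
# F1 is the harmonic mean of precision and recall.
precision, recall = 0.88, 0.83
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.85, matching the reported F1 score
```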
1. Multi-label Classification: Extend the model to classify different types of toxicity (e.g., hate speech, threats).
2. Data Augmentation: Generate synthetic examples to address class imbalance.
3. Model Optimization: Experiment with lighter models such as DistilBERT for faster inference.
4. Multi-language Support: Expand to detect toxicity in other languages.
This project is licensed under the MIT License. See the LICENSE file for more details.