A machine learning project that automatically identifies spam and toxic comments from YouTube using natural language processing and pre-trained transformer models.
This project analyzes YouTube comments to classify them as spam or legitimate content by combining:
- Toxicity detection using a pre-trained transformer model
- Pattern matching for common spam indicators
- Text preprocessing and cleaning techniques
- Automated Text Cleaning: Removes URLs, mentions, hashtags, and special characters
- Dual Classification Approach:
  - Toxicity scoring using `martin-ha/toxic-comment-model`
  - Pattern-based spam detection
- Comprehensive Analysis: Statistical reporting and visualization
- Flexible Thresholds: Customizable spam detection sensitivity
```bash
pip install transformers torch torchvision
pip install scikit-learn pandas numpy matplotlib seaborn
pip install nltk wordcloud textblob
```
- Clone this repository
- Install the required packages
- Prepare your YouTube comments dataset in CSV format
- Run the Jupyter notebook
Ensure your CSV file contains a `CONTENT` column with the comment text:
```csv
CONTENT,category
"Great video! Thanks for sharing",ham
"Subscribe to my channel! Check out www.spam.com",spam
```
```python
import pandas as pd
from transformers import pipeline

# Load and preprocess data
df = pd.read_csv('Youtube-Spam-Dataset.csv')
df['CLEAN_CONTENT'] = df['CONTENT'].apply(clean_text)

# Initialize the classifier
classifier = pipeline('text-classification', model='martin-ha/toxic-comment-model')

# Generate spam labels
df['CLASS_LABEL'] = create_spam_labels(df)
```
The classifier outputs:
- `CLASS_LABEL`: Binary classification (0 = Clean, 1 = Spam)
- Statistical summary of spam detection rates
- Visualization charts showing comment distribution
The spam detection algorithm considers a comment as spam if:
- High Pattern Match: 2+ spam patterns detected
- Medium Pattern + Toxicity: 1+ patterns AND toxicity score > 0.3
- High Toxicity: Toxicity score > 0.7
Detected spam patterns include:
- Keywords: `subscribe`, `channel`, `check out`, `follow me`, `my channel`, `visit`, `website`
- URLs (`.com`, `www`)
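Putting the rules together, the decision logic might look like the following minimal sketch. The pattern list and the function name `label_comment` are illustrative, not the project's exact code:

```python
import re

# Illustrative subset of the project's spam patterns
SPAM_PATTERNS = [r'subscribe', r'channel', r'check out', r'www\.']

def label_comment(text: str, toxicity: float) -> int:
    """Return 1 (spam) or 0 (clean) using the three threshold rules."""
    matches = sum(bool(re.search(p, text.lower())) for p in SPAM_PATTERNS)
    if matches >= 2:                     # High pattern match
        return 1
    if matches >= 1 and toxicity > 0.3:  # Medium pattern + toxicity
        return 1
    if toxicity > 0.7:                   # High toxicity
        return 1
    return 0
```

Note that the three rules are checked in order of precedence; a comment with two or more pattern hits is flagged regardless of its toxicity score.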
The `clean_text()` function performs:
- Converts text to lowercase
- Removes URLs (`http://`, `https://`, `www.`)
- Removes mentions (`@username`) and hashtags (`#tag`)
- Strips special characters and extra whitespace
- Handles missing/null values
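The steps above can be sketched with standard-library regexes; this is an approximation of the behavior described, not the project's exact implementation:

```python
import re

def clean_text(text) -> str:
    """Lowercase, strip URLs/mentions/hashtags/special chars, collapse whitespace."""
    if not isinstance(text, str):  # handle missing/NaN values
        return ""
    text = text.lower()
    text = re.sub(r'https?://\S+|www\.\S+', ' ', text)  # URLs
    text = re.sub(r'[@#]\w+', ' ', text)                # mentions and hashtags
    text = re.sub(r'[^a-z0-9\s]', ' ', text)            # special characters
    return re.sub(r'\s+', ' ', text).strip()            # extra whitespace
```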
The classifier uses the pre-trained `martin-ha/toxic-comment-model`, which provides:
- Toxicity probability scores
- Binary toxic/non-toxic classification
- Robust performance on social media content
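A Hugging Face `text-classification` pipeline returns a list of `{'label', 'score'}` dicts. A small helper like the one below converts that into a single toxicity probability; it assumes the model's labels are `toxic`/`non-toxic`, which you should verify against the model card:

```python
def toxicity_score(result) -> float:
    """Map a pipeline output like [{'label': 'toxic', 'score': 0.92}]
    to a toxicity probability in [0, 1]."""
    entry = result[0]
    if entry['label'].lower() == 'toxic':
        return entry['score']
    return 1.0 - entry['score']  # 'non-toxic' -> complement
```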
The project generates:
- Pie Chart: Distribution of clean vs spam comments
- Statistical Summary: Total counts and spam percentage
- Color-coded Results: Green for clean, red for spam
Modify the thresholds in `create_spam_labels()`:

```python
# More sensitive detection
if pattern_matches >= 1:  # Lower threshold
    spam_label = 1
elif toxicity > 0.5:  # Lower toxicity threshold
    spam_label = 1
```
Extend the spam patterns list:

```python
spam_patterns = [
    r'subscribe', r'channel', r'check out',
    r'like and subscribe',  # Add custom patterns
    r'hit the bell',
    r'free download'
]
```
- Fork the repository
- Create a feature branch (`git checkout -b feature/improvement`)
- Commit your changes (`git commit -am 'Add new feature'`)
- Push to the branch (`git push origin feature/improvement`)
- Create a Pull Request
This project is open source and available under the MIT License.
- `martin-ha/toxic-comment-model` for the pre-trained toxicity classifier
- Hugging Face Transformers for the pipeline infrastructure
- YouTube Spam Dataset contributors
If you encounter any issues or have questions:
- Check the Issues section
- Review the code comments and documentation
- Submit a new issue with detailed information
Note: This classifier is designed for educational and research purposes. Always review automated classifications and consider implementing human moderation for production systems.