E-Commerce Product Sentiment analysis - breakdown.
Data Loading and Exploration: Load the dataset using pd.read_csv. Display sample data and check for missing values. python Copy code
df = pd.read_csv('ecommerce_reviews.csv')
print("Sample Data:") print(df.head())
print("\nMissing Values:") print(df.isnull().sum()) Sentiment Distribution Visualization: Visualize the distribution of sentiments using a bar plot. python Copy code
sentiment_counts = df['Sentiment'].value_counts() sentiment_counts.plot(kind='bar', title='Sentiment Distribution') plt.xlabel('Sentiment') plt.ylabel('Count') plt.show() Text Preprocessing and Vectorization: Use CountVectorizer for text preprocessing. Transform the text data into a numerical format for machine learning. python Copy code
vectorizer = CountVectorizer(stop_words='english') X = vectorizer.fit_transform(df['ReviewText']) Train-Test Split and Naive Bayes Classification: Split the data into training and testing sets. Train a Naive Bayes classifier. python Copy code
X_train, X_test, y_train, y_test = train_test_split(X, df['Sentiment'], test_size=0.2, random_state=42)
classifier = MultinomialNB() classifier.fit(X_train, y_train) Model Evaluation: Make predictions and evaluate the model using accuracy, confusion matrix, and classification report. python Copy code
y_pred = classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred) conf_matrix = confusion_matrix(y_test, y_pred) classification_rep = classification_report(y_test, y_pred)
print("\nModel Evaluation:") print(f"Accuracy: {accuracy}") print("\nConfusion Matrix:") print(conf_matrix) print("\nClassification Report:") print(classification_rep) Word Clouds for Positive and Negative Sentiments: Generate and display Word Clouds for positive and negative sentiments. python Copy code
positive_reviews = ' '.join(df[df['Sentiment'] == 'positive']['ReviewText']) negative_reviews = ' '.join(df[df['Sentiment'] == 'negative']['ReviewText'])
wordcloud_positive = WordCloud(width=800, height=800, background_color='white', stopwords='english', min_font_size=10).generate(positive_reviews) wordcloud_negative = WordCloud(width=800, height=800, background_color='white', stopwords='english', min_font_size=10).generate(negative_reviews)
plt.figure(figsize=(8, 8), facecolor=None) plt.imshow(wordcloud_positive) plt.axis("off") plt.tight_layout(pad=0) plt.title('Word Cloud - Positive Sentiments') plt.show()
plt.figure(figsize=(8, 8), facecolor=None) plt.imshow(wordcloud_negative) plt.axis("off") plt.tight_layout(pad=0) plt.title('Word Cloud - Negative Sentiments') plt.show() This script combines data exploration, machine learning classification, and visualization to analyze sentiments in E-Commerce product reviews. You can further refine and expand the analysis based on your specific requirements and the characteristics of your dataset.