Skip to content

Film2549/Data-Analysis-of-a-Simulated-Marketing-Business-Case-Using-Python-SQL-and-Power-BI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

59 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Data Analysis of a Simulated Marketing Business Case Using Python, SQL and Power BI

Overview

This project is inspired by a Youtube tutorial, where I apply and adapt the skills I've learned to analyze data from a simulated business scenario with a dataset made up of several tables. The goal is to solve three major business problems by finding insights and proposing possible solution. The process involves using data analysis skills and tools, including SQL (via SQL Server), Python, and PowerBI to analyze, clean and visualize the data.

This project helps me understand the process, learn how to use programming languages in working application, and identify my mistakes to improve. It also shows me how data analysis can help solve real-world business challenges.

Data source

given in Data source folder ( as .bak file ).

Built With

Table Of Contents

Introduction to Business Problem

Suppose that there are Simulated Market named ShopEasy, an online retail business, is facing reduced customer engagement and conversion rates. They are reaching out to you to help conduct a detailed analysis and identify areas for improvement in their marketing strategies.

image

From reading above, I set that KPIs are

  • Conversion Rate : Percentage of website visitors who make a purchase.
  • Customer Engagement Rate : Level of interaction with marketing content (clicks, likes, comments)
  • Customer Feedback Score : Average rating from customer reviews.

So I provided my goals are

  • Increase Conversion Rates : Identify factors impacting the conversion rate and provide recommendations to improve it.
  • Enhance Customer Engagement : Determine which types of content make the highest engagement.
  • Improve Customer Feedback Scores : Understand common themes in customer reviews and provid actionable insights.

Data Model

Defining an effective data structure in a dashboard is important. The images below show tables used in analysis.

image

Tables used in this model

  • customers - Data related to each individual customers, their CustomerID, Age, etc.
  • products - Data related to each individual products, their ProductID, price, etc.
  • customer_journey - Data of every Jouney recorded by their CustomerID, ProductID, action such as View and Click etc.
  • customer_reviews - Every review recorded and their reviewText.
  • engagement_data - Data of every engagement relate to their ContentType.
  • geography - GeographyID reference for mapping countries to their capital cities.

Query Data Using SQL

Open and connect to my local database on my computer using Microsoft SQL Server Management Studio, then restore the .bak file and have database named PortfolioProject_MarketingAnalytics. Write an SQL statement for each table based on requirements. Below is a summary of the work I have completed.

image

  • customers : Query to join dim_customers with dim_geography to enrich customer data with geographic information. The new table is dim_customers.
  • products : Query to categoriz products based on their price. The new table is dim_products.
  • customer_journey : Query to clean table by removing duplicates, handle missing values by replacing them with calculated averages then get a new table named fact_customer_journey.
  • customer_reviews : Query to clean whitespace issues in the reviewText column. The new table is fact_customer_reviews.
  • engagement_data : Query for Extracting and splitting combined metrics (Views and Clicks), formatting dates for uniformity, and standardizing values. The new table is fact_engagement_data.

You can see full codes in SQL Queries folder.

Sentiment Analysis Using Python

In the fact_customer_reviews table, there are various review comments. I applied Python libraries to classify and categorize the comments into distinct categories. The nltk library and SentimentIntensityAnalyzer are important in classification of Sentiment. Additionally, pandas and pyodbc also play crucial roles.

image

Steps

  1. Import Libaries : Pandas for data manipulation, pyodbc for database connection, and nltk for sentiment analysis.
  2. Fetch Data from SQL Server : Connect to a Microsoft SQL Server database (PortfolioProject_MarketingAnalytics). Run a SQL query to fetch customer review data into a Pandas DataFrame.
  3. Calculate SentimentScore : Initialize VADER Sentiment Analyzer, then analyze the ReviewText using VADER and calculate a compound sentiment scores (ranging from -1 to 1). Add SentimentScore column to the Dataframe.
  4. Categorize Sentiments : Combine the sentiment score and the review's numerical Rating to classify each review into categories like Positive, Negative, Mixed Positive, etc. Add SentimentCategory column to the Dataframe.
  5. Bucket Sentiment Scores : Group a Sentiment score into predefine text ranges such as 0.5 to 1.0 (Strongly Positive), 0.0 to 0.49 (Mildly Positive), etc. Add SentimentBucket column to the Dataframe.
  6. Save the updated DataFrame to a new CSV file named fact_customer_reviews_with_sentiment.csv.

you can view full code in Python Code folder.

Building an Interface Dashboard in Power BI

Start making an Interface by importing every modified table before to Power BI, Then Making an Calender table for connect evey table having Date column ( Dax code in Power BI Dashboard folder. ).

Set the Datatype for each columnn correctly to get the relationship of all tables as following :

image

Creating a dashboard to display all Key Performance Indicators(KPIs), and complete with filtering options for years, months, and product name.

Overview Page

Provides a comprehensive snapshot of key performance indicators (KPIs) such as conversion rate, social media engagement, and customer reviews.

image

Conversion Page

Analyze the conversion by differnt action and product-specific perfomance.

image

Social Media Page

Tracks social media important engagement result and the impact of content types.

image

Customer Review Page

Evaluate Customer feedback and rating to different products to find actionable insight.

image

.pbix file and dashboard screen shot are in Dashboard folder.

Data Analysis

Here is a brief analysis and summary following provided goals above. The full presentation slide are Data Presentation Full.

image

  1. Increase Conversion Rates

January recorded the highest overall conversion rate at 17.3%, driven significantly by the Ski Boots.

Suggesttion : Focus marketing efforts on products that show high conversion rates, such as Kayaks, Ski Boots, and Baseball Gloves. Implement seasonal promotions during peak months (e.g., January and September) to make these trend significantly more profitable.

image

  1. Enhance Customer Engagement

Blog content drove the most views, especially in April and July, while social media and video content maintained steady.

Suggestion : Restore Content Strategy. Experiment with more engaging content formats, such as blog content, particularly during historically lower-engagement months (September-December).

image

  1. Improve Customer Feedback Scores

The majority of customer reviews are in the higher ratings, with 431 reviews at 4 stars and 409 reviews at 5 stars, indicating overall positive feedback.

Suggestion : The existence of mixed positive and mixed negative sentiments suggests that there are opportunities to convert those mixed experiences into more clearly positive ones.

image

Credits

Most of contents in this project was created by Jirapat Chaiya, a Computer Engineering student at the Faculty of Engineering, Chulalongkorn University.

This project is inspired and guided by Ali Ahmad Special thanks to him for his valuable insights and tutorials.

Releases

No releases published

Packages

No packages published

Languages