This project, titled “Strategic Data Insights: Evaluating Target’s Performance in Brazil,” presents a comprehensive analysis of operational data from Target's Brazilian market. The dataset spans from 2016 to 2018 and includes 100,000 orders, split into eight CSV files covering customers, sellers, order items, geolocation, payments, orders, products, and customer reviews.
The aim is to derive valuable insights through multi-level queries, covering a wide range of business aspects such as order processing, pricing, payment methods, shipping performance, customer demographics, product attributes, and customer satisfaction. These insights can help enhance Target's strategic decision-making in the Brazilian market, leading to improved business performance and customer satisfaction.
The dataset contains the following eight CSV files:
- customers.csv
- sellers.csv
- order_items.csv
- geolocation.csv
- payments.csv
- orders.csv
- products.csv
- customer_reviews.csv
1. List all unique cities where customers are located.
2. Count the number of orders placed in 2017.
3. Find the total sales per category.
4. Calculate the percentage of orders that were paid in installments.
5. Count the number of customers from each state.
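As an illustration of how two of these queries might look in pandas, here is a minimal sketch. The tiny DataFrames stand in for payments.csv and customers.csv, and the column names (`order_id`, `payment_installments`, `customer_id`, `customer_state`) are assumptions, not confirmed by the project. An order is treated as "paid in installments" when any of its payments uses more than one installment, which is also an assumption about the definition.

```python
import pandas as pd

# Hypothetical miniature stand-ins for payments.csv and customers.csv.
# Column names are assumed for illustration only.
payments = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "payment_installments": [1, 3, 1, 6],
})
customers = pd.DataFrame({
    "customer_id": ["c1", "c2", "c3", "c4"],
    "customer_state": ["SP", "RJ", "SP", "MG"],
})


def installment_share(payments: pd.DataFrame) -> float:
    """Percentage of orders with at least one payment split into 2+ installments."""
    # An order can have several payment rows, so reduce to one value per order.
    max_installments = payments.groupby("order_id")["payment_installments"].max()
    return float((max_installments > 1).mean() * 100)


def customers_per_state(customers: pd.DataFrame) -> pd.Series:
    """Number of distinct customers in each state, largest first."""
    return (
        customers.groupby("customer_state")["customer_id"]
        .nunique()
        .sort_values(ascending=False)
    )


pct_installments = installment_share(payments)
state_counts = customers_per_state(customers)
print(f"{pct_installments:.1f}% of orders were paid in installments")
print(state_counts)
```

With the real files, the two DataFrames would instead come from `pd.read_csv`, and the same functions would apply as long as the column names match.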
1. Calculate the number of orders per month in 2018.
2. Find the average number of products per order, grouped by customer city.
3. Calculate the percentage of total revenue contributed by each product category.
4. Identify the correlation between product price and the number of times a product has been purchased.
5. Calculate the total revenue generated by each seller, and rank them by revenue.
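The revenue-share and seller-ranking queries can be sketched in pandas as follows. This is an illustrative example on made-up rows, not the project's actual code: the column names (`product_id`, `seller_id`, `price`, `product_category`) are assumptions, and revenue is approximated here by summing item prices, whereas the analysis may have used payment values instead.

```python
import pandas as pd

# Hypothetical miniature stand-ins for order_items.csv and products.csv.
order_items = pd.DataFrame({
    "order_id": ["o1", "o1", "o2", "o3"],
    "product_id": ["p1", "p2", "p1", "p3"],
    "seller_id": ["s1", "s2", "s1", "s2"],
    "price": [100.0, 50.0, 100.0, 250.0],
})
products = pd.DataFrame({
    "product_id": ["p1", "p2", "p3"],
    "product_category": ["toys", "books", "toys"],
})

# Attach each item's category, then express category revenue as a share of the total.
items = order_items.merge(products, on="product_id", how="left")
category_revenue = items.groupby("product_category")["price"].sum()
category_share = (category_revenue / category_revenue.sum() * 100).round(2)

# Total revenue per seller, sorted from highest to lowest, with a dense rank.
seller_revenue = (
    order_items.groupby("seller_id")["price"]
    .sum()
    .sort_values(ascending=False)
    .to_frame("revenue")
)
seller_revenue["rank"] = (
    seller_revenue["revenue"].rank(method="dense", ascending=False).astype(int)
)

print(category_share)
print(seller_revenue)
```

A dense rank is used so that sellers with equal revenue share a rank without leaving gaps; `method="min"` would be the alternative if gaps after ties are preferred.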
1. Calculate the moving average of order values for each customer over their order history.
2. Calculate the cumulative sales per month for each year.
3. Calculate the year-over-year growth rate of total sales.
4. Calculate the retention rate of customers, defined as the percentage of customers who make another purchase within 6 months of their first purchase.
5. Calculate the top 3 customers who spent the most money in each year.
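Several of these queries rely on window-style calculations, which the following pandas sketch illustrates on a handful of made-up orders. The column names (`customer_id`, `order_purchase_timestamp`, `payment_value`) and the three-order moving-average window are assumptions chosen for the example. Retention follows the definition above: a customer counts as retained if they place a later order within 6 months of their first one.

```python
import pandas as pd

# Hypothetical orders already joined with their payment totals.
orders = pd.DataFrame({
    "customer_id": ["c1", "c1", "c2", "c2", "c3", "c1"],
    "order_purchase_timestamp": pd.to_datetime([
        "2017-01-10", "2017-03-05", "2017-02-01",
        "2018-01-15", "2017-06-20", "2018-02-01",
    ]),
    "payment_value": [100.0, 50.0, 200.0, 300.0, 150.0, 225.0],
})
orders["year"] = orders["order_purchase_timestamp"].dt.year
orders["month"] = orders["order_purchase_timestamp"].dt.month

# Moving average of order value per customer (window of 3 orders is an assumption).
orders_sorted = orders.sort_values(["customer_id", "order_purchase_timestamp"]).copy()
orders_sorted["moving_avg"] = orders_sorted.groupby("customer_id")["payment_value"].transform(
    lambda s: s.rolling(window=3, min_periods=1).mean()
)

# Cumulative sales per month, restarting at the beginning of each year.
monthly = orders.groupby(["year", "month"])["payment_value"].sum().sort_index()
cumulative = monthly.groupby(level="year").cumsum()

# Year-over-year growth of total sales, in percent (NaN for the first year).
yearly = orders.groupby("year")["payment_value"].sum()
yoy_growth = yearly.pct_change() * 100

# Retention: share of customers with a later order within 6 months of their first.
first_order = orders.groupby("customer_id")["order_purchase_timestamp"].transform("min")
is_later = orders["order_purchase_timestamp"] > first_order
in_window = orders["order_purchase_timestamp"] <= first_order + pd.DateOffset(months=6)
retained = orders.loc[is_later & in_window, "customer_id"].nunique()
retention_rate = retained / orders["customer_id"].nunique() * 100

print(cumulative)
print(yoy_growth)
print(f"Retention rate: {retention_rate:.2f}%")
```

Note that the strict `>` comparison ignores a second order placed at exactly the same timestamp as the first; whether such orders should count as repeat purchases is a definitional choice left open here.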
The analysis was performed using the following technologies and tools:
- Python: For data manipulation and analysis.
- Pandas: For data processing and cleaning.
- NumPy: For numerical operations.
- Matplotlib & Seaborn: For data visualization.
- SQL: For querying the dataset to extract insights.
- Jupyter Notebooks: To document and present the analysis in an interactive format.
- Excel: Used for initial data exploration and organization.
- Integrate machine learning models to predict customer retention rates and identify key factors driving repeat purchases.
- Enhance the dashboard to visualize real-time sales and performance metrics.
- Explore more advanced analytics techniques such as customer segmentation and product recommendation systems.