The aim of this research is to predict how many of Fresco's loyalty cardholders can be categorized as low spenders (£50 or less per week) and high spenders (more than £50). It will help the marketing team identify which customer segments to target with promotional strategies.

SuryaPrakashJ123/Logistic_Regression_Analysis_Fresco-s_loyalty_cardholders-

Fresco Supermarket Customer Spend Classification

Project Overview

This project analyzes weekly transaction data from Fresco Supermarket, one of the UK’s largest grocery retailers. The aim is to identify patterns and predictors of customer basket value, classifying shoppers as either Low Spenders (£50 or less per basket) or High Spenders (over £50), using a range of business analytics tools.

Dataset Description

  • Source: Fresco Supermarket Loyalty Cardholder Weekly Data (26-week period)
  • Sample: Randomly selected loyalty cardholders across three channels: convenience stores, superstores, and the online platform
  • Variables:
    • Gender: Customer gender (Male/Female)
    • Age: Customer age in years
    • Store_Type: Type of store (Convenience, Superstore, Online)
    • Shopping_Frequency: Number of shopping visits per week
    • Basket_Value: Total basket spend (£)
    • Basket_Consistency: Predominant product type (e.g., Value, Brand, Fresco Top)
    • HighSpender: Target binary variable: 1 = High Spender (>£50), 0 = Low Spender (≤£50)
  • File: Portfolio-Task-1-Short Data_Fresco1.xlsx

Part A – Executive Summary (for Head of Marketing)

Aim & Objectives

  • Classify Fresco customers as Low or High Spenders based on demographic and behavioural data.
  • Identify key predictors of high-value spending for targeted marketing.

Approach

  • Explored and cleaned the dataset for analysis.
  • Applied logistic regression to predict HighSpender status.
  • Evaluated model performance using classification accuracy, pseudo-R² measures (Cox & Snell, Nagelkerke), and the classification table.

Key Results

  • Classification Accuracy: The model correctly classified 94.7% of all customers.
    [Image: Classification Table]
  • Model Fit: Strong model fit (Nagelkerke R² = 0.873).
    [Image: Model Summary Table]
  • Statistical Significance: The model is highly significant (Omnibus test, p < 0.001).
    [Image: Omnibus Test of Model Coefficients]

Recommendations

  • Target high-frequency shoppers and customers with consistent preferences for premium (Fresco Top) or branded products for upselling and loyalty campaigns.
  • Utilize customer age and store type in segment-specific promotions.
  • Monitor shopping frequency as a key predictor of spending behaviour.

Part B – Technical Analysis

1. Data Preparation

  • Loaded dataset from Excel (Portfolio-Task-1-Short Data_Fresco1.xlsx).
  • Inspected for missing values and outliers. Minimal cleaning was required; outliers in basket value were retained to preserve spend variation.
  • Engineered HighSpender as the binary target variable (1 if Basket_Value > £50; else 0).
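
The target-engineering step can be sketched as follows. This is a minimal, dependency-free illustration rather than the author's code: the rows are made up, the helper name `label_high_spender` is hypothetical, and only the £50 threshold and the column names come from the description above.

```python
# Hypothetical rows that mirror the README's column names; the real data
# live in the Excel file and are not reproduced here.
rows = [
    {"Shopping_Frequency": 1, "Basket_Value": 32.40},
    {"Shopping_Frequency": 3, "Basket_Value": 50.00},
    {"Shopping_Frequency": 4, "Basket_Value": 78.15},
]

THRESHOLD_GBP = 50.0  # cut-off stated in the README


def label_high_spender(basket_value, threshold=THRESHOLD_GBP):
    """Return 1 for a High Spender (strictly above the threshold), else 0."""
    return 1 if basket_value > threshold else 0


# Attach the binary target to every row.
for row in rows:
    row["HighSpender"] = label_high_spender(row["Basket_Value"])

high_spender_flags = [row["HighSpender"] for row in rows]
```

Note that a basket of exactly £50 falls in the Low Spender class, matching the "≤£50" definition of the target.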

2. Method Selection & Justification

  • Logistic regression was chosen for its interpretability and suitability for binary outcomes.
  • Alternative models (e.g., decision trees) were considered, but logistic regression provided clear coefficient interpretations and robust diagnostics.

3. Model Fitting & Assumptions

  • Fitted a logistic regression model using predictors: Age, Gender, Store_Type, Shopping_Frequency, and Basket_Consistency.
  • Assumptions checked:
    • Linearity of logit for numeric predictors (age, frequency)
    • No perfect multicollinearity detected among predictors
    • All input variables were categorical or continuous, as required
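
To make the fitting step concrete, here is a minimal sketch of logistic regression with a single numeric predictor, estimated by plain gradient ascent. It is an illustration under stated assumptions, not the procedure used for the reported results: the toy data are invented, the function names are hypothetical, and the full model above also includes Age, Gender, Store_Type, and Basket_Consistency (the categorical ones would need dummy coding first).

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.1, iterations=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(iterations):
        # Residuals between observed labels and current predicted probabilities.
        errors = [yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y)]
        b0 += learning_rate * sum(errors) / n
        b1 += learning_rate * sum(e * xi for e, xi in zip(errors, x)) / n
    return b0, b1


# Made-up, non-separable toy data: visits per week and HighSpender flag.
visits = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
high_spender = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

intercept, slope = fit_logistic(visits, high_spender)
p_low = sigmoid(intercept + slope * 1)   # predicted probability at 1 visit/week
p_high = sigmoid(intercept + slope * 5)  # predicted probability at 5 visits/week
```

In practice a statistics package would be used instead, since it also reports standard errors and significance tests for each coefficient.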

4. Model Outputs

  • Classification Table:
    [Image: Classification Table]
    Overall accuracy: 94.7%. Sensitivity and specificity are both above 93%.
  • Omnibus Test:
    [Image: Omnibus Test]
    The model is highly significant (p < 0.001).
  • Model Fit:
    [Image: Model Summary]
    Cox & Snell R² = 0.650, Nagelkerke R² = 0.873 (very strong model fit).
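
For readers unfamiliar with these statistics, the sketch below shows how the omnibus (likelihood-ratio) chi-square and the two pseudo-R² measures are computed from the log-likelihoods of the intercept-only and fitted models. The log-likelihood values and sample size are hypothetical placeholders, not figures from the Fresco output, and the function names are illustrative.

```python
import math


def likelihood_ratio_chi_square(ll_null, ll_model):
    """Omnibus test statistic: improvement of the fitted model over the null model."""
    return 2.0 * (ll_model - ll_null)


def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell pseudo-R²; its maximum is below 1."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R²: Cox & Snell rescaled by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_cox_snell


# Hypothetical inputs, NOT taken from the Fresco model output.
n_customers = 100
ll_null = n_customers * math.log(0.5)  # intercept-only model with a 50/50 class split
ll_model = -20.0                       # assumed log-likelihood of the fitted model

chi_square = likelihood_ratio_chi_square(ll_null, ll_model)
cs = cox_snell_r2(ll_null, ll_model, n_customers)
nk = nagelkerke_r2(ll_null, ll_model, n_customers)
```

Because Nagelkerke divides Cox & Snell by its ceiling, it is always the larger of the two. The reported pair (0.650 and 0.873) implies a ceiling of roughly 0.745, which is close to the 0.75 ceiling of an evenly split sample.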

5. Coefficient Interpretation

  • Shopping frequency was the strongest positive predictor: each additional visit per week increased the odds of being a High Spender.
  • Age showed only a weak association with spending; controlling for other factors, younger customers were slightly more likely to be High Spenders.
  • Store type and basket consistency (premium brands or Fresco Top products) also contributed positively to HighSpender classification.
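
The phrase "increased the odds" refers to the exponentiated coefficient. As a brief sketch, a logistic regression coefficient β translates into an odds ratio of exp(β) per one-unit increase in the predictor. The coefficient values below are hypothetical, since the fitted coefficients are not listed in this README.

```python
import math


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in the predictor."""
    return math.exp(coefficient)


def percent_change_in_odds(coefficient):
    """The same effect expressed as a percentage change in the odds."""
    return (odds_ratio(coefficient) - 1.0) * 100.0


# Hypothetical coefficients for illustration only.
beta_frequency = math.log(2)  # would mean each extra weekly visit doubles the odds
beta_age = -0.01              # a small negative value: odds fall slightly per year of age

frequency_or = odds_ratio(beta_frequency)
age_change = percent_change_in_odds(beta_age)
```

A positive coefficient gives an odds ratio above 1, and a negative one gives an odds ratio below 1, which is how the direction of the Age and Shopping_Frequency effects above should be read.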

6. Model Diagnostics

  • Classification accuracy: 94.7% (see table above).
  • The pseudo-R² values are very high for a logistic regression, suggesting strong explanatory power. However, this may reflect dataset characteristics (the sample, feature engineering, or possible overfitting on a small subsample).
  • The confusion matrix shows balanced sensitivity and specificity (few false positives and false negatives).
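
The diagnostics above can be reproduced from any pair of actual and predicted labels. The following sketch computes accuracy, sensitivity, and specificity from a confusion matrix; the label lists are invented for illustration and do not correspond to the 94.7% result.

```python
def confusion_counts(actual, predicted):
    """Return (true positives, true negatives, false positives, false negatives)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(actual, predicted):
    """Accuracy, sensitivity (High Spenders caught), specificity (Low Spenders caught)."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return {
        "accuracy": (tp + tn) / len(actual),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented labels: 1 = High Spender, 0 = Low Spender.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]

metrics = classification_metrics(actual, predicted)
```

Reporting sensitivity and specificity alongside accuracy matters because accuracy alone can look high when one class dominates the sample.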

7. Limitations & Assumptions

  • Sample size is modest for generalization; results are robust for the subsample but may require validation on larger or full datasets.
  • Some variables (such as basket consistency) are self-reported or categorical; future work could further refine these with more granular product data.
  • The model assumes stability of customer behaviour over the observed period; seasonal effects and promotions are not modeled here.
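
One way to act on the validation concern is to hold out part of the data before fitting. The sketch below is a suggestion rather than something the analysis reports doing: it splits row indices into training and test sets while preserving the High/Low Spender proportions. The function name, the 30% test fraction, and the label list are all assumptions.

```python
import random


def stratified_holdout(labels, test_fraction=0.3, seed=0):
    """Split row indices into train/test sets, keeping class proportions similar."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Invented target column: 10 Low Spenders followed by 10 High Spenders.
spender_labels = [0] * 10 + [1] * 10

train_idx, test_idx = stratified_holdout(spender_labels)
```

The model would then be fitted on the training rows only, with accuracy, sensitivity, and specificity recomputed on the held-out rows to check whether the in-sample figures carry over.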

8. Technical Conclusion

  • This analysis demonstrates that Fresco can accurately predict high-value spenders using standard loyalty cardholder data.
  • Logistic regression provides interpretable and actionable insights for marketing management, especially around frequency and channel preference.
  • Further improvements could involve testing ensemble models or time-series features as more data becomes available.