Skip to content

MeganAlbee/knn-audience-classifier

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

K-Nearest Neighbors Classification

This project was a part of the Masters of Science in Business Analytics at the University of Montana.

Project Intro/Objective

This project aims to use K-Nearest Neighbors (KNN) to classify a local tavern's customer base. These classifications can be used to target audiences for marketing purposes to increase RIO, retain customers and learn more business insights.

Project Description

While SKlearn in Python is very robust, this project uses the Euclidean Distance method in Python by hand to classify an audience. After classifying the audience, a segmented dataset was used to compare the accuracy of my segmentation versus the correct set.

Plots made with Seaborn were used to help understand the accuracy and the dataset. In addition, data visuals like this help understand the broader concept of the segments.

Key learnings

  1. Using KNN to segment customer audiences.

  2. Foundational understanding of KNN and how it works by hand.

  3. KNN comes with limitations but is still powerful with smaller datasets.

About the Dataset:

  • customer_id: a unique ID for the customer
  • most_popular_category: the most popular drink category for the customer
  • relationship_days: the number of days between the first and last transaction for the customer.
  • total_spend: the lifetime total spend by the customer
  • beverage_categories: the number of distinct beverage categories the customer has purchased from.
  • segment: a four-level categorical variable that segments the customers.

About

Using K-Nearest Neighbors Algorithm (without SKlearn) in Python to develop customer audience segments.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published