Sabaudian/AMD_Market_Basket_Analysis

Algorithms for Massive Datasets - Project 2: Market-basket analysis

Python Badge Apache Spark Badge Open in Colab

Summary

The task is to implement, from scratch, a system that finds frequent itemsets (also known as market-basket analysis), treating each movie as a basket and its actors as items.
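To make the basket/item mapping concrete, here is a minimal sketch in plain Python of how (movie, actor) records could be grouped into baskets. The `rows` data, the identifiers in it, and the `build_baskets` helper are hypothetical and only illustrate the idea; the actual dataset's columns and the project's own loading code may differ.

```python
from collections import defaultdict

# Hypothetical (movie_id, actor_id) records; the real dataset's schema may differ.
rows = [
    ("m1", "a1"), ("m1", "a2"), ("m1", "a3"),
    ("m2", "a1"), ("m2", "a2"),
    ("m3", "a2"), ("m3", "a3"),
    ("m3", "a3"),  # a duplicate record, absorbed by the set below
]

def build_baskets(rows):
    """Group actors by movie: each movie is a basket, each actor an item."""
    baskets = defaultdict(set)
    for movie, actor in rows:
        baskets[movie].add(actor)  # sets ignore repeated (movie, actor) records
    return dict(baskets)

baskets = build_baskets(rows)
for movie, actors in sorted(baskets.items()):
    print(movie, sorted(actors))
```

Using a set per basket means an actor is counted at most once per movie, which matches the usual definition of support as the number of baskets containing an itemset.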

Introduction

Market-basket analysis was originally used by retailers to uncover relationships between items across customer transactions, with the main goals of revealing products that are often bought together, optimizing product placement, and proposing targeted offers to customers. Today the technique is used in a variety of applications, such as fraud detection, understanding customer behavior under different conditions, and healthcare, where it helps identify relationships between diseases and symptoms. In general terms, it models a many-to-many relationship between two kinds of entities: items and baskets. This study focuses on finding frequent itemsets in a dataset that collects various information about movies, treating movies as baskets and actors as items. To this end, two algorithms were implemented from scratch: the A-priori algorithm and the algorithm of Park, Chen, and Yu (PCY).
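As an illustration of the two algorithms named above, the following is a minimal single-machine sketch for frequent pairs only, written in plain Python. It is not the repository's implementation: the function names, the `n_buckets` parameter, the use of Python's built-in `hash`, and the toy `example_baskets` are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

def apriori_pairs(baskets, min_support):
    """A-priori for pairs: count items first, then only pairs of frequent items."""
    # Pass 1: support of each single item.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    # Pass 2: count candidate pairs whose members are both frequent.
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(kept, 2))
    return {p: c for p, c in pair_counts.items() if c >= min_support}

def pcy_pairs(baskets, min_support, n_buckets=101):
    """PCY for pairs: pass 1 also hashes every pair into a bucket counter."""
    item_counts = Counter()
    bucket_counts = [0] * n_buckets
    # Pass 1: count items and hash each pair to a bucket.
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[hash(pair) % n_buckets] += 1
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    # Between passes: summarize the buckets as a bitmap of frequent buckets.
    bitmap = [c >= min_support for c in bucket_counts]
    # Pass 2: count a pair only if both items are frequent
    # and the pair hashes to a frequent bucket.
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        for pair in combinations(kept, 2):
            if bitmap[hash(pair) % n_buckets]:
                pair_counts[pair] += 1
    return {p: c for p, c in pair_counts.items() if c >= min_support}

# Toy baskets (movies) with hypothetical actor identifiers as items.
example_baskets = [
    {"a1", "a2", "a3"},
    {"a1", "a2"},
    {"a2", "a3"},
    {"a1", "a4"},
]

print(apriori_pairs(example_baskets, min_support=2))
print(pcy_pairs(example_baskets, min_support=2))
```

Both functions return the same frequent pairs. A bucket's count is at least the support of every pair hashed into it, so the bitmap can only discard infrequent candidates; PCY's benefit is a smaller set of pairs to count in the second pass, not a different result.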
