A comprehensive recommendation system framework that implements multiple collaborative filtering algorithms with built-in community bias analysis and community edge suppression capabilities. This is the code for the Bachelor's thesis of Leon Orou; the study can be found here.
pip install -r requirements.txt
python main.py [OPTIONS]
- `--model_name` (default: 'MultiVAE')
  - Choices: 'LightGCN', 'ItemKNN', 'MultiVAE'
  - Description: Specifies which recommendation algorithm to use
- `--dataset_name` (default: 'ml-100k')
  - Choices: 'ml-100k', 'ml-1m', 'lastfm'
  - Description: Dataset to use for training and evaluation
- `--users_top_percent` (default: 0.05)
  - Range: 0.0 - 1.0
  - Description: Percentage of top-connected users to consider as power nodes
- `--items_top_percent` (default: 0.00)
  - Range: 0.0 - 1.0
  - Description: Percentage of top-connected items to consider as power nodes
- `--users_dec_perc_suppr` (default: 0.625)
  - Range: 0.0 - 1.0
  - Description: Percentage of biased user connections to suppress for bias mitigation. Biased edges are usually ~60% of all edges
- `--items_dec_perc_suppr` (default: 0.0)
  - Range: 0.0 - 1.0
  - Description: Percentage of biased item connections to suppress for bias mitigation. Biased edges are usually ~60% of all edges
- `--community_suppression` (default: 0.8)
  - Range: 0.0 - 1.0
  - Description: Strength of community-based edge suppression (higher = more suppression)
- `--suppress_power_nodes_first` (default: 'True')
  - Choices: 'True', 'False'
  - Description: Whether to prioritize suppressing connections of highly-connected nodes
- `--use_suppression` (default: 'True')
  - Choices: 'True', 'False'
  - Description: Enable/disable community bias suppression entirely
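
The options listed above could be parsed roughly as follows. This is a minimal sketch built on Python's standard `argparse` module, not a copy of the repository's `main.py`, which may be organized differently. The helper names `build_parser` and `unit_interval` are hypothetical. Only the option names, defaults, choices, and ranges are taken from this README.

```python
import argparse


def unit_interval(text: str) -> float:
    """Parse a float and reject values outside the documented 0.0 - 1.0 range."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"{text} is not in the range 0.0 - 1.0")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options documented in this README."""
    parser = argparse.ArgumentParser(description="Recommendation framework options")
    parser.add_argument("--model_name", default="MultiVAE",
                        choices=["LightGCN", "ItemKNN", "MultiVAE"])
    parser.add_argument("--dataset_name", default="ml-100k",
                        choices=["ml-100k", "ml-1m", "lastfm"])

    # Options documented with a 0.0 - 1.0 range, paired with their defaults.
    ranged_options = [
        ("--users_top_percent", 0.05),
        ("--items_top_percent", 0.0),
        ("--users_dec_perc_suppr", 0.625),
        ("--items_dec_perc_suppr", 0.0),
        ("--community_suppression", 0.8),
    ]
    for name, default in ranged_options:
        parser.add_argument(name, type=unit_interval, default=default)

    # The README documents these flags as the strings 'True' / 'False'.
    for name in ("--suppress_power_nodes_first", "--use_suppression"):
        parser.add_argument(name, default="True", choices=["True", "False"])
    return parser


# Example: equivalent to `python main.py --model_name LightGCN --use_suppression False`
args = build_parser().parse_args(["--model_name", "LightGCN",
                                  "--use_suppression", "False"])
use_suppression = args.use_suppression == "True"  # convert the string flag to a bool
```

Because the boolean-like flags are passed as strings, the sketch converts them explicitly with a comparison against `'True'` rather than relying on `bool()`, which would treat the non-empty string `'False'` as true.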
The system generates:
- Metrics: NDCG, Recall, MRR, Hit Rate, Item Coverage, Gini Index, Average Recommended Popularity, Popularity Lift, Popularity Miscalibration, Simpson Index (of item genres), Intra List Diversity (of item genres), Normalized Genre Entropy, Unique Genres Recommended, User Community Bias
- Logs: configuration, per-fold, and fold-average result logs are written to the logs/ folder
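
To illustrate what a few of the metrics listed above measure, the sketch below uses common textbook definitions of Recall@k, NDCG@k (binary relevance), and the Gini index over item recommendation counts. These are assumptions for illustration only: the function names are hypothetical, and the framework's own implementations may differ in normalization or other details.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Share of a user's relevant items that appear in the top-k recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """NDCG with binary relevance: DCG of the list divided by the ideal DCG."""
    relevant = set(relevant)
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def gini_index(counts):
    """Gini index of how often each item is recommended.

    0.0 means all items are recommended equally often; values near 1.0
    mean recommendations concentrate on very few items.
    """
    values = sorted(counts)
    n, total = len(values), sum(values)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * v for i, v in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n


# One relevant item out of two is recovered in the top 3.
print(recall_at_k([1, 2, 3], [2, 4], k=3))   # 0.5
# The only relevant item is ranked first, so the ranking is ideal.
print(ndcg_at_k([2, 1], [2], k=2))           # 1.0
# All recommendations go to a single item out of four.
print(gini_index([0, 0, 0, 4]))              # 0.75
```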