This repository contains Python code that implements a trust-based collaborative filtering approach for enhancing rating predictions in recommendation systems. The methodology is based on the research paper titled "Novel Implicit-Trust-Network-based Recommendation Methodology" by Reza Barzegar Nozari and Hamidreza Koohi.

RezaBN/ITNRM-Implicit-Trust-Network-based-Recommendation-Methodology

This README provides an overview of the ITNRM methodology and how to get started with the code. Researchers and developers can use this repository as a starting point to implement and experiment with the ITNRM approach in their recommendation systems.


Implicit-Trust-Network-based Recommendation Methodology (ITNRM)

Introduction

Welcome to the official repository for the "Implicit-Trust-Network-based Recommendation Methodology" (ITNRM) developed by Reza Barzegar Nozari and Hamidreza Koohi [1]. ITNRM is a recommendation methodology that aims to enhance recommender systems through a novel method for constructing an implicit trust network. It introduces new approaches to similarity estimation, confidence measurement, and opinion-alignment assessment, with the goal of improving recommendation accuracy.

Citation: If you use ITNRM in your research or project, please cite the original papers:

  • [1] Reza Barzegar Nozari and Hamidreza Koohi. "Novel Implicit-Trust-Network-based Recommendation Methodology." Expert Systems with Applications 186 (2021): 115709. (https://doi.org/10.1016/j.eswa.2021.115709)
  • [2] Reza Barzegar Nozari and Hamidreza Koohi. "An Implicit Trust-Network construction approach and a recommendation methodology for recommender systems." Software Impacts 12 (2022): 100242. (https://doi.org/10.1016/j.simpa.2022.100242)

Key Contributions

Our research presents several key contributions to the field of recommender systems:

  1. Asymmetric Similarity Metric: We introduce a novel asymmetric similarity metric based on the Pearson correlation coefficient and a distance metric, providing a more refined measure of user similarity.

  2. Confidence Measurement: We define a new method for quantifying user confidence, enhancing our understanding of the reliability of user interactions.

  3. Identical Opinion Estimation: We propose a novel method to estimate the degree of identical opinion between users, shedding light on the alignment of user preferences.

  4. Trust Network Construction: We formulate a unique approach to determine trust levels between users and construct a trust network using the metrics of similarity, confidence, and identical opinion.

  5. Recommendation Strategy: We develop a novel recommendation strategy that incorporates the trust network, leading to more personalized and accurate recommendations.

ITNRM Methodology Overview

The ITNRM methodology consists of two main parts:

A) Implicit-Trust-Network Construction Method

  1. Incipient Trust Network Formation: In the first phase, an incipient implicit-trust-network is constructed by combining the novel similarity measure, confidence measure, and identical opinion measure, all introduced in our research.

  2. Trust Network Evaluation and Reconstruction: In the second phase, the validity of the incipient implicit-trust-network is evaluated, and the network is then reconstructed into a telic implicit-trust-network to ensure its accuracy.

B) Combination Ranking Method for Recommendation

  1. Predicting Item Ratings: We predict item ratings based on the user's trust network.

  2. Scoring Items: Items are scored considering the target user's trust network.

  3. Ranking Items: Items are ranked twice, once by predicted rating and once by score.

  4. Final Ranking: A final ranking is generated by combining the two rankings, and high-ranking items are recommended to the target user.
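
The last two steps can be illustrated with a minimal sketch. It assumes the two rankings are combined by averaging rank positions; this README does not state the exact combination rule, so treat the averaging (and the hypothetical name `combine_rankings`) as a stand-in and see the paper [1] for the actual formulation.

```python
import numpy as np

def combine_rankings(predicted_ratings, scores):
    """Average two per-item rank positions (assumed rule); lower is better."""
    predicted_ratings = np.asarray(predicted_ratings, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # argsort of argsort turns values into rank positions (0 = best item).
    rank_by_rating = np.argsort(np.argsort(-predicted_ratings, kind="stable"), kind="stable")
    rank_by_score = np.argsort(np.argsort(-scores, kind="stable"), kind="stable")
    combined = (rank_by_rating + rank_by_score) / 2.0
    # Item indices ordered from most to least recommended; ties keep index order.
    return np.argsort(combined, kind="stable")

predicted = [4.5, 3.0, 4.8, 2.0]   # toy predicted ratings for four items
scores = [0.9, 0.2, 0.6, 0.4]      # toy trust-based scores for the same items
final_order = combine_rankings(predicted, scores)
top_2 = final_order[:2].tolist()   # the two items that would be recommended
```

Averaging ranks rather than raw values avoids having to put ratings and scores on a common scale, which is one reason rank-based fusion is a common choice for this kind of step.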

For a more detailed explanation of the various steps of the ITNRM approach, please refer to the published article, "Novel Implicit-Trust-Network-based Recommendation Methodology" [1].

Code Overview

This Python script showcases the steps involved in implementing and evaluating a trust-based collaborative filtering algorithm for rating prediction, based on the paper [1]. Below is a breakdown of the processing steps and functions in the code:

Overall Process Steps

  1. Data Preprocessing:

    • Load the input data from a CSV file (Data.csv) that contains user-item ratings.
    • Extract unique user and movie IDs from the data.
    • Set the maximum and minimum rating values (MaxRating and MinRating).
  2. User-Item Matrix Creation:

    • Create a user-item matrix using the loaded data. The matrix represents users' ratings for different items.
  3. Calculating Similarity, Confidence, and Identical Opinion Matrices:

    • Calculate the Pearson Correlation Coefficient (PCC) matrix, which measures the similarity between users' rating patterns.
    • Calculate the distance matrix, based on the Euclidean distance between users' rating vectors.
    • Calculate the confidence matrix, based on the number of neighbors and commonly rated items between users.
    • Calculate the identical opinion matrix, which measures how closely users' opinions agree.
  4. Generating Incipient Trust Network:

    • Combine the similarity, confidence, and identical opinion matrices to generate an incipient trust network.
    • This network represents the initial trust levels between users based on their similarity, confidence, and opinion alignment.
  5. Reconstructing Incipient Trust to Telic Trust:

    • Propagate trust values in the incipient trust network using a formula based on direct trust.
    • Calculate telic trust values by combining the propagated trust values with trustees' precision.
  6. Preparing Evaluation Data:

    • Split the user-item matrix into training and test sets using K-Fold cross-validation (K=5).
    • Convert the training and test data into numpy arrays.
    • Create a trust matrix for the test trustees.
  7. Evaluating Trust Network Performance:

    • For each test user, calculate the predicted ratings based on their trustees' ratings and weights.
    • Compare predicted ratings with actual ratings to evaluate the trust network's performance.
    • Calculate precision, recall, mean absolute error (MAE), and root mean squared error (RMSE) metrics.
  8. Printing Results:

    • Display the runtime of the code execution.
    • Print the evaluation results, including precision, recall, MAE, and RMSE.
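
Steps 1 to 3 can be sketched on a toy dataset as follows. This is a minimal illustration, not the repository's code: the column names (`user_id`, `item_id`, `rating`) and the helper names are assumptions, a rating of 0 is taken to mean "not rated", and the way the paper combines PCC and distance into its asymmetric similarity is not shown.

```python
import numpy as np
import pandas as pd

# Toy ratings in long format; the column names are assumptions.
ratings = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2, 3, 3],
    "item_id": [10, 20, 30, 10, 20, 30, 10, 20],
    "rating":  [5, 3, 1, 4, 2, 1, 1, 5],
})

# Step 2: user-item matrix (rows = users, columns = items, 0 = not rated).
user_item = ratings.pivot_table(index="user_id", columns="item_id",
                                values="rating", fill_value=0)
matrix = user_item.to_numpy(dtype=float)

def pcc_on_corated(u, v):
    """Pearson correlation over items both users rated (0.0 if undefined)."""
    mask = (u > 0) & (v > 0)
    if mask.sum() < 2:
        return 0.0
    du = u[mask] - u[mask].mean()
    dv = v[mask] - v[mask].mean()
    denom = np.sqrt((du ** 2).sum() * (dv ** 2).sum())
    return float((du * dv).sum() / denom) if denom > 0 else 0.0

def distance_on_corated(u, v):
    """Euclidean distance over items both users rated (NaN if none)."""
    mask = (u > 0) & (v > 0)
    if not mask.any():
        return float("nan")
    return float(np.sqrt(((u[mask] - v[mask]) ** 2).sum()))

pcc_1_3 = pcc_on_corated(matrix[0], matrix[2])        # users 1 and 3 disagree
dist_1_2 = distance_on_corated(matrix[0], matrix[1])  # users 1 and 2 are close
```

Restricting both measures to co-rated items keeps the zero placeholders from being read as very low ratings.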

Please note that the current version of the code exclusively focuses on the implementation of the proposed method for constructing trust relationships and the relevant metrics, as described in the academic paper. The recommendation methodology part proposed in the paper is not included in this code version.

Explanation of Main Functions

  1. create_user_item_df(df)

    • Role: This function takes as input a DataFrame (df) containing user-item interactions (e.g., user ratings for movies) and transforms it into a user-item matrix.
    • Functionality: It pivots the input DataFrame to create a user-item matrix where rows represent users, columns represent items, and the matrix elements represent user-item interactions (ratings). It fills in missing values with zeros.
  2. calculate_distance(user_item_matrix)

    • Role: This function calculates the distance matrix between users based on their item ratings.
    • Functionality: It computes pairwise distances between users using Numba-compiled functions for efficient parallel processing. The distance is calculated based on the intersection of items rated by both users and the squared differences in their ratings.
  3. calculate_similarity(user_item_matrix)

    • Role: This function calculates the similarity matrix between users.
    • Functionality: It uses the output from the calculate_distance function to compute the similarity matrix, which measures the similarity in item ratings between users. The function also calculates a similarity threshold for each user.
  4. calculate_confidence(user_item_matrix, similarity_dict)

    • Role: This function calculates a confidence matrix reflecting the trust level between users.
    • Functionality: It uses the similarity information obtained from the calculate_similarity function to compute a measure of confidence in user relationships. The function determines the count of neighbors and common rated items between users to establish trust levels.
  5. calculate_identical_opinion(user_item_matrix, similarity_dict, confidence_dict)

    • Role: This function calculates a matrix reflecting the extent to which users share identical opinions.
    • Functionality: It measures the sameness of opinions between users using Numba-compiled functions. The function considers the similarity, confidence, and item ratings when assessing the sameness of opinions.
  6. generate_incipient_trust(similarity_dict, confidence_dict, identical_opinion_dict)

    • Role: This function constructs an initial trust network among users.
    • Functionality: It combines the similarity, confidence, and identical opinion information to create an incipient trust network. This network represents trust relationships between users based on their interactions and shared opinions. The network is further evaluated and reconstructed to form a more reliable telic trust network.
  7. reconstruct_incipiet_trust_to_telic(user_item_matrix, incipient_trust_dict)

    • Role: This function reconstructs the incipient trust network into a telic trust network, considering trustees' precision.
    • Functionality: It evaluates the precision of trustees' recommendations for item ratings and uses this information to create a telic trust network. The telic trust network reflects more accurate trust relationships between users.
  8. prepare_evaluation_data(user_ids, user_item_df, telic_trust_dict)

    • Role: This function prepares data for evaluating the performance of the trust network in rating prediction.
    • Functionality: It splits the user-item data into training and test sets using K-Fold cross-validation. Additionally, it extracts the trust matrix for the test set.
  9. evaluate_trust_network(train_data, test_data, test_trustees, metrics_thr)

    • Role: This function evaluates the performance of the trust network in rating prediction.
    • Functionality: It predicts item ratings for the test users based on trust and evaluates the quality of predictions using precision, recall, mean absolute error (MAE), and root mean square error (RMSE).
  10. main()

    • Role: The main function that orchestrates the execution of all the above functions.
    • Functionality: It loads the user-item interaction data, applies the ITN methodology, and evaluates its performance in predicting users’ preferences. The results, including precision, recall, MAE, RMSE, and runtime, are displayed.
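
The K-Fold split described for prepare_evaluation_data can be sketched with NumPy alone. This is an illustration under assumptions: the split is taken to be over users (rows of the user-item matrix), and `kfold_user_splits` is a hypothetical helper, not a function from this repository.

```python
import numpy as np

def kfold_user_splits(user_ids, n_splits=5, seed=0):
    """Yield (train_ids, test_ids); each user is in a test fold exactly once."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(user_ids))
    folds = np.array_split(shuffled, n_splits)
    for k in range(n_splits):
        test_ids = folds[k]
        train_ids = np.concatenate([folds[j] for j in range(n_splits) if j != k])
        yield train_ids, test_ids

user_ids = np.arange(1, 11)                      # ten toy user IDs
splits = list(kfold_user_splits(user_ids, n_splits=5))
```

Fixing the seed makes the folds reproducible across runs, which helps when comparing metric values between code changes.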

These functions collectively implement the trust-network construction part of the Implicit-Trust-Network-based Recommendation Methodology (ITNRM) described in the paper by Reza Barzegar Nozari and Hamidreza Koohi [1]. ITNRM leverages trust networks among users to improve the accuracy and personalization of recommendations in collaborative filtering recommender systems. Each function plays a role in constructing or evaluating the trust network.
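
To make the evaluation step more concrete, the sketch below predicts a test user's ratings as a trust-weighted mean of their trustees' ratings and then computes the four reported metrics. It is a simplified illustration under assumptions: the weighted mean, the helper names, and the `like_threshold` parameter (the rating at or above which an item counts as relevant) are not taken from the repository's code.

```python
import numpy as np

def predict_from_trustees(trustee_ratings, trust_weights):
    """Trust-weighted mean rating per item; NaN where no trustee rated it."""
    rated = trustee_ratings > 0
    weights = trust_weights[:, None] * rated        # zero weight for unrated items
    weight_sum = weights.sum(axis=0)
    weighted = (weights * trustee_ratings).sum(axis=0)
    return np.divide(weighted, weight_sum,
                     out=np.full(weight_sum.shape, np.nan),
                     where=weight_sum > 0)

def evaluate(predicted, actual, like_threshold):
    """MAE, RMSE, precision and recall over items with both values present."""
    mask = (actual > 0) & ~np.isnan(predicted)
    err = predicted[mask] - actual[mask]
    pred_like = predicted[mask] >= like_threshold
    true_like = actual[mask] >= like_threshold
    tp = int((pred_like & true_like).sum())
    return {
        "mae": float(np.abs(err).mean()),
        "rmse": float(np.sqrt((err ** 2).mean())),
        "precision": float(tp / pred_like.sum()) if pred_like.any() else 0.0,
        "recall": float(tp / true_like.sum()) if true_like.any() else 0.0,
    }

trustee_ratings = np.array([[5.0, 0.0, 2.0],    # trustee A's ratings (0 = unrated)
                            [3.0, 4.0, 0.0]])   # trustee B's ratings
trust_weights = np.array([0.75, 0.25])          # toy trust values
predicted = predict_from_trustees(trustee_ratings, trust_weights)
metrics = evaluate(predicted, actual=np.array([5.0, 3.0, 2.0]), like_threshold=3.5)
```

Zeroing the weight of trustees who did not rate an item keeps their placeholder zeros from dragging the prediction down.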

Getting Started

To use ITNRM in your own projects or research, follow these steps:

  1. Clone this repository to your local machine.
  2. Create and activate a virtual environment.
  3. Install the requirements with the command: pip install -r requirements.txt
  4. Prepare your dataset or use the provided MovieLens 100K dataset (Data.csv).
  5. Customize the code to adapt it to your dataset and requirements.
  6. Run the script.
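
When preparing your own dataset (step 4), a quick sanity check before running the script can save time. The sketch below is only a suggestion: the column names and the `check_ratings` helper are assumptions, so adjust them to match the layout of Data.csv and your MinRating and MaxRating values.

```python
import io
import pandas as pd

REQUIRED_COLUMNS = ["user_id", "item_id", "rating"]  # assumed column names

def check_ratings(df, min_rating=1, max_rating=5):
    """Return a list of human-readable problems found in a ratings table."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    if df[REQUIRED_COLUMNS].isna().any().any():
        problems.append("null values present")
    if df.duplicated(subset=["user_id", "item_id"]).any():
        problems.append("duplicate (user, item) pairs")
    if not df["rating"].between(min_rating, max_rating).all():
        problems.append("ratings outside the expected range")
    return problems

# In-memory stand-in for a CSV file; the last row is deliberately invalid.
sample_csv = io.StringIO("user_id,item_id,rating\n1,10,5\n1,20,3\n2,10,4\n2,10,9\n")
sample = pd.read_csv(sample_csv)
problems = check_ratings(sample)
```

Duplicate (user, item) pairs are worth catching early because a pivot into a user-item matrix either fails on them or silently aggregates them, depending on the pandas call used.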

Important Notes

This code is designed for use with the MovieLens 100K dataset, which consists of 944 users providing 100,000 ratings to 1,968 items. The code includes functions that operate on full matrices of user-item interactions (containing user ratings for items) and user-user relationships (including similarity, confidence, identical opinions, and trust among users). As a result, running this code on large datasets may cause memory errors and very long running times.
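
A back-of-the-envelope estimate shows why full matrices become a problem as the number of users grows. The sketch assumes dense 64-bit floating-point storage (8 bytes per value); the dtypes actually used in the code may differ.

```python
def dense_matrix_mib(n_rows, n_cols, bytes_per_value=8):
    """Size in MiB of a dense matrix with the given shape and value width."""
    return n_rows * n_cols * bytes_per_value / 2**20

n_users, n_items = 944, 1968                        # figures quoted above
user_item_mib = dense_matrix_mib(n_users, n_items)  # roughly 14 MiB
user_user_mib = dense_matrix_mib(n_users, n_users)  # roughly 7 MiB per matrix

# The user-user matrices grow quadratically with the number of users:
large_user_user_gib = dense_matrix_mib(100_000, 100_000) / 1024  # roughly 75 GiB
```

Because several user-user matrices (similarity, confidence, identical opinion, trust) exist at once, the quadratic term is multiplied by the number of matrices kept in memory.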

We have made efforts to optimize runtime by leveraging NumPy's features and, in some cases, Numba. On our system (16.0 GB RAM and an Intel Core i7 CPU), the current version requires approximately 4 minutes to complete all processes. However, memory usage can still be a concern when working with large datasets.

To address potential memory and performance issues in larger datasets, please consider the following suggestions:

  1. Reduce Memory Usage:

    • Examine whether you can optimize your code to use smaller data structures.
    • Avoid creating large intermediate matrices if possible.
    • Use memory-efficient data types.
  2. Memory Management or Garbage Collection:

    • Delete matrices that are no longer needed in your code to free up memory.
    • Reducing memory usage can help prevent memory errors and improve running time.
  3. Batch Processing:

    • Break down calculations into smaller batches to reduce memory usage.
  4. Sparse Matrices:

    • If your data is sparse (mostly containing zero values), consider using sparse matrix representations, such as scipy.sparse, to significantly reduce memory consumption.
  5. Parallelization:

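    • Utilize parallel processing to distribute computations across multiple CPU cores, which can speed up calculations. Note that parallelism by itself does not reduce memory usage and may increase it if each worker holds its own copy of the data.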
  6. Optimize Algorithms:

    • Review the scripts for opportunities to reduce complexity or optimize calculations to minimize memory usage.
  7. Consider a Server/Cloud Solution:

    • If the computations are too memory-intensive for your local machine, explore options such as cloud-based solutions or servers with higher memory capacities.
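
Three of these suggestions (memory-efficient data types, sparse matrices, and batch processing) can be combined, as in the sketch below. It uses random toy data, and the `neighbor_counts_in_batches` helper is hypothetical; it is meant as a starting point rather than a drop-in replacement for any function in this repository.

```python
import numpy as np
from scipy import sparse

# Toy user-item matrix: 200 users x 300 items, mostly zeros (unrated).
rng = np.random.default_rng(0)
dense = np.zeros((200, 300), dtype=np.float64)
rows = rng.integers(0, 200, size=1500)
cols = rng.integers(0, 300, size=1500)
dense[rows, cols] = rng.integers(1, 6, size=1500)

# Memory-efficient dtype: integer ratings 1..5 fit in one byte each.
compact = dense.astype(np.uint8)

# Sparse representation: CSR stores only the non-zero entries.
csr = sparse.csr_matrix(compact)
csr_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes

def neighbor_counts_in_batches(rated_csr, batch_size=64):
    """For each user, count other users sharing at least one rated item."""
    n_users = rated_csr.shape[0]
    rated_t = rated_csr.T.tocsr()
    counts = np.zeros(n_users, dtype=np.int64)
    for start in range(0, n_users, batch_size):
        stop = min(start + batch_size, n_users)
        # Only a (batch, n_users) block of co-rated counts is dense at a time.
        block = (rated_csr[start:stop] @ rated_t).toarray()
        block[np.arange(stop - start), np.arange(start, stop)] = 0  # drop self-pairs
        counts[start:stop] = (block > 0).sum(axis=1)
    return counts

rated = sparse.csr_matrix((compact > 0).astype(np.int32))  # 1 where a rating exists
counts = neighbor_counts_in_batches(rated)
```

The batch size trades peak memory for the number of sparse products: smaller batches keep the dense block small, larger ones reduce loop overhead.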

We welcome and appreciate any solutions or edited versions you may develop to improve memory efficiency and overall performance when working with large datasets. Collaboration and contributions are encouraged to enhance the code and the proposed methods.

Thank you for using this code, and we look forward to your feedback and contributions.


For any questions or inquiries, please contact the authors of the original paper.
