As this is an industry collaboration with Bergens Tidende, no data is or will be published in this repository. The repository covers the Exploratory Analysis, the Offline Evaluation and the analysis of the Online Evaluation results. For code related to the implementation of the user study, see below:
The code in this repository was developed as part of my Master's thesis, Personalized News Recommendation in the Sports Domain. It is organized into four directories: exploratory_analysis, helpers, models and online_evaluation. The exploratory_analysis directory contains the notebooks performing the Exploratory Analysis for the thesis, with separate notebooks for each embedding model. The helpers directory contains helper methods and tools used for processing and evaluation. The models directory contains the implementations of all models evaluated in the Offline Evaluation, with separate notebooks for each model.
The online_evaluation directory contains a script for running the binomial test, a script for producing plots, and two versions of the dataset collected from the study responses. user_study_unfiltered_data.json contains all responses in unfiltered form, i.e. before the attention check filter was applied. user_study_data.json, the version used in the thesis, is the filtered dataset, from which all responses that failed the attention checks have been removed.
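To illustrate the two steps described above, the following is a minimal standard-library sketch of an attention check filter followed by an exact one-sided binomial test. It is not the script shipped in online_evaluation. The field names passed_attention_check and preferred, the toy records, the null probability of 0.5 and the one-sided alternative are all assumptions made for illustration; the actual JSON structure and test setup may differ.

```python
from math import comb


def filter_attention_checks(responses, key="passed_attention_check"):
    """Keep only responses whose attention-check flag is truthy.

    The key name is hypothetical; adapt it to the real JSON structure.
    """
    return [r for r in responses if r.get(key)]


def binomial_test_greater(successes, trials, p=0.5):
    """Exact one-sided binomial test.

    Returns P(X >= successes) for X ~ Binomial(trials, p), i.e. the
    p-value under the alternative that the true success rate exceeds p.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Toy records with made-up field names; this is not the real study data.
responses = [
    {"passed_attention_check": True, "preferred": "personalized"},
    {"passed_attention_check": True, "preferred": "personalized"},
    {"passed_attention_check": False, "preferred": "baseline"},
    {"passed_attention_check": True, "preferred": "baseline"},
    {"passed_attention_check": True, "preferred": "personalized"},
    {"passed_attention_check": True, "preferred": "personalized"},
]

filtered = filter_attention_checks(responses)
wins = sum(r["preferred"] == "personalized" for r in filtered)
p_value = binomial_test_greater(wins, len(filtered))
print(f"{wins}/{len(filtered)} preferred personalized, p-value = {p_value:.4f}")
```

With the real data, the list of responses would be loaded from user_study_unfiltered_data.json using json.load before filtering. If SciPy is installed, scipy.stats.binomtest offers the same exact test with two-sided alternatives and confidence intervals.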
To run the code, install the requirements listed in requirements.txt in a virtual environment:
pip install -r requirements.txt