Run scripts/collect_data.py and scripts/create_samples.py to collect posts from Reddit and create comment chains. You need your own Reddit client ID and secret. To generate them, create a Reddit account, go to https://www.reddit.com/prefs/apps/, and follow the steps to create a developer application, selecting the "script" type from the radio buttons.
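One way to keep the client ID and secret out of the repository is to read them from environment variables. The sketch below is only an illustration: the variable names REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET and the helper load_reddit_credentials are assumptions, not names used by scripts/collect_data.py.

```python
import os


def load_reddit_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; adapt them to however
    scripts/collect_data.py expects to receive the credentials.
    """
    env = os.environ if env is None else env
    client_id = env.get("REDDIT_CLIENT_ID", "").strip()
    client_secret = env.get("REDDIT_CLIENT_SECRET", "").strip()

    # Report every missing variable at once instead of failing on the first.
    missing = [
        name
        for name, value in (
            ("REDDIT_CLIENT_ID", client_id),
            ("REDDIT_CLIENT_SECRET", client_secret),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing Reddit credentials: " + ", ".join(missing))
    return client_id, client_secret
```

Failing early with a clear message avoids a less obvious authentication error once the collection script starts making requests.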
Create the conda environment before running the LDA notebook:
conda env create -f lda.yml
The LIWC analysis requires the LIWC_2015.dic file.
You must supply this file yourself, because a license must be purchased to obtain and use it.
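For orientation, here is a minimal sketch of how a .dic-style dictionary could be read. It assumes the commonly seen layout: a category header enclosed between two "%" lines, followed by tab-separated rows of an entry and its category ids. The function parse_liwc_dic and the toy data are hypothetical. They are not taken from scripts/liwc_analysis.py or from the licensed file.

```python
def parse_liwc_dic(lines):
    """Parse .dic-style lines into (categories, lexicon).

    categories maps category id -> category name.
    lexicon maps dictionary entry -> list of category names.
    Assumes a header between two "%" lines, then tab-separated entry rows.
    """
    categories = {}
    lexicon = {}
    percent_seen = 0  # counts the "%" delimiter lines seen so far
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("%"):
            percent_seen += 1
            continue
        parts = line.split("\t")
        if percent_seen == 1 and len(parts) >= 2:
            # Header row: "<id>\t<name>"
            categories[parts[0]] = parts[1]
        elif percent_seen >= 2:
            # Entry row: "<entry>\t<id>\t<id>..."; unknown ids are skipped
            lexicon[parts[0]] = [categories[i] for i in parts[1:] if i in categories]
    return categories, lexicon


# Made-up toy data for illustration only, NOT content from LIWC_2015.dic.
toy = ["%", "1\tposemo", "2\tnegemo", "%", "happy\t1", "sad*\t2", "bittersweet\t1\t2"]
categories, lexicon = parse_liwc_dic(toy)
```

With the real dictionary, an open file handle can be passed in place of the toy list. Entries ending in "*" are kept verbatim here; treating them as prefix wildcards is left to the matching step.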
To run the LIWC analysis, run the Python script scripts/liwc_analysis.py.
This file serves both as a collection of analysis methods and as a playground (at the bottom of the file) for running them. The methods include code comments and type hints that explain, in general terms, how they work.
NOTE: You will need to define the path variables at the beginning of the script. Recommended file paths are provided in the code comments.
Examples of how to run the methods for the t-test, LIWC category word usage, and Kruskal-Wallis test are at the bottom of the file. Feel free to experiment with them.
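To make the two tests concrete, the sketch below computes their test statistics in plain Python. It is not the code in scripts/liwc_analysis.py: the function names welch_t and kruskal_h and the sample values are made up for illustration, and p-values are omitted. The t-test variant shown is Welch's, which is an assumption; the script may use a different variant.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (no p-value)."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error


def kruskal_h(*groups):
    """Kruskal-Wallis H statistic using average ranks, without tie correction."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Assign each distinct value the average of the 1-based ranks it occupies.
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2
        i = j

    # Sum over groups of (rank sum squared) / (group size).
    total = sum(sum(rank[v] for v in group) ** 2 / len(group) for group in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical per-sample usage rates of one LIWC category in two groups.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
t_stat = welch_t(group_a, group_b)
h_stat = kruskal_h(group_a, group_b)
```

If SciPy is available, scipy.stats.ttest_ind (with equal_var=False for Welch's test) and scipy.stats.kruskal return both the statistic and a p-value. Note that scipy.stats.kruskal applies a tie correction, so its H can differ from this sketch when values repeat.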