LetterboxdML

My individual final project for STA4241 - Statistical Learning in R. In this project, I used my personal movie-watching data from Letterboxd.com to train machine learning models (specifically, classification algorithms including ordered logistic regression, K-nearest neighbors, and support vector machines) and evaluate their predictive performance on my movie ratings.

Dataset

Letterboxd allows you to export your personal data directly from the website, but the export is limited to the movies you watched, the day you watched them, and the rating you gave each one out of five stars. To obtain more relevant covariates for model training, I wrote a web-scraping script using the rvest package to pull information such as Average Rating, Runtime, and Genre from the website. This script is named lbxd_scraper.R and can be found in the main directory.
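As a rough illustration of the scraping step, the sketch below parses a small inline HTML snippet with rvest. It is not the contents of lbxd_scraper.R: the HTML, the CSS classes (.average-rating, .runtime, .genre), and the helper scrape_film() are hypothetical stand-ins, since the actual Letterboxd page structure is not described here.

```r
library(rvest)

# Stand-in HTML for a single film page. The markup and class names are
# hypothetical; a real page would need its own selectors.
html <- '<html><body>
  <span class="average-rating">3.9</span>
  <p class="runtime">128 mins</p>
  <a class="genre">Drama</a>
  <a class="genre">Thriller</a>
</body></html>'

# Hypothetical helper: extract three covariates from a parsed page.
scrape_film <- function(page) {
  avg_text     <- html_text2(html_element(page, ".average-rating"))
  runtime_text <- html_text2(html_element(page, ".runtime"))
  genres       <- html_text2(html_elements(page, ".genre"))

  data.frame(
    avg_rating = as.numeric(avg_text),
    # Keep only the digits, e.g. "128 mins" becomes 128
    runtime    = as.numeric(gsub("[^0-9]", "", runtime_text)),
    # Collapse multiple genres into one comma-separated string
    genre      = paste(genres, collapse = ", "),
    stringsAsFactors = FALSE
  )
}

page <- read_html(html)
film <- scrape_film(page)
print(film)
```

In a real run, read_html() would presumably be given a film page URL instead of an inline string, and the resulting rows would be joined to the exported diary by film.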

Analysis

All of my variable selection and model training procedures can be found in the lbxd_analysis.R script, which is broken up into several simulation studies based on the data and/or covariates being used. These simulation studies are only loosely organized in the script, but their results are all gathered and reported in the final report.
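To give a sense of what one such comparison might look like, here is a minimal sketch that fits an ordered logistic regression (MASS::polr) and a K-nearest neighbors classifier (class::knn) and compares their hold-out misclassification error. It is not taken from lbxd_analysis.R: the data are synthetic, and the variable names, the four rating levels, the 200/100 split, and k = 5 are all assumptions made for illustration.

```r
library(MASS)   # polr(): ordered logistic regression
library(class)  # knn(): K-nearest neighbors

set.seed(4241)

# Synthetic stand-in data (NOT the Letterboxd export): two covariates and
# an ordered four-level rating derived from a latent score.
n          <- 300
avg_rating <- runif(n, 2, 4.5)
runtime    <- rnorm(n, 110, 20)
latent     <- 1.5 * avg_rating + 0.01 * runtime + rlogis(n)
rating     <- cut(latent,
                  breaks = quantile(latent, c(0, 0.2, 0.5, 0.8, 1)),
                  labels = c("low", "mid", "high", "top"),
                  include.lowest = TRUE, ordered_result = TRUE)
films <- data.frame(rating, avg_rating, runtime)

# Hold-out split: 200 training rows, 100 test rows.
train_idx <- sample(n, 200)
train     <- films[train_idx, ]
test      <- films[-train_idx, ]

# Standardize covariates using training-set means and standard deviations.
covs    <- c("avg_rating", "runtime")
x_train <- scale(train[, covs])
x_test  <- scale(test[, covs],
                 center = attr(x_train, "scaled:center"),
                 scale  = attr(x_train, "scaled:scale"))
train_std <- data.frame(rating = train$rating, x_train)
test_std  <- data.frame(rating = test$rating, x_test)

# Ordered logistic regression.
fit_polr  <- polr(rating ~ avg_rating + runtime, data = train_std)
pred_polr <- predict(fit_polr, newdata = test_std, type = "class")

# K-nearest neighbors with k = 5 (an arbitrary choice for this sketch).
pred_knn <- knn(x_train, x_test, cl = train$rating, k = 5)

# Misclassification rate; compare as character to avoid mixing ordered
# and unordered factors.
misclass <- function(pred, truth) {
  mean(as.character(pred) != as.character(truth))
}
test_error <- c(polr = misclass(pred_polr, test$rating),
                knn  = misclass(pred_knn, test$rating))
print(test_error)
```

Repeating the split many times and averaging the error rates would turn this single comparison into a small simulation study.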

Report

The final report that I submitted is titled sta4241_final.pdf. It contains a full, detailed account of my data exploration and analysis, along with visualizations and a discussion of the results. I also made a PowerPoint presentation of the same findings for my oral presentation of the project, found under Letterboxd Data Analysis.pptx (there is also a PDF version).

