Taming Big Data with Apache Spark and Python - Hands On!
- Tally up the amount spent by each customer using Spark
- Sort the results by amount spent per customer
- Find similar movies
- Improve the quality of similar movies
- Cosine Similarity (default)
- Pearson Correlation Coefficient
- Jaccard Coefficient
- Euclidean Distance
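
The four similarity metrics listed above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's actual code: the function names are hypothetical, `pairs` is assumed to be a list of `(rating_a, rating_b)` tuples from users who rated both movies, and the Jaccard inputs are assumed to be the sets of user IDs who rated each movie. In a Spark job, functions like these would typically be applied to the grouped rating pairs of each movie pair.

```python
from math import sqrt


def cosine_similarity(pairs):
    """Cosine of the angle between the two rating vectors (1.0 = same direction)."""
    sum_xx = sum_yy = sum_xy = 0.0
    for x, y in pairs:
        sum_xx += x * x
        sum_yy += y * y
        sum_xy += x * y
    denominator = sqrt(sum_xx) * sqrt(sum_yy)
    return sum_xy / denominator if denominator else 0.0


def pearson_correlation(pairs):
    """Pearson correlation: like cosine, but with each movie's mean rating subtracted."""
    n = len(pairs)
    if n == 0:
        return 0.0
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    denominator = sqrt(var_x * var_y)
    return covariance / denominator if denominator else 0.0


def jaccard_coefficient(users_a, users_b):
    """Share of users who rated both movies among users who rated either one."""
    a, b = set(users_a), set(users_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def euclidean_distance(pairs):
    """Straight-line distance between the rating vectors (lower = more similar)."""
    return sqrt(sum((x - y) ** 2 for x, y in pairs))
```

Note that cosine, Pearson, and Jaccard are similarity scores (higher means more alike), while Euclidean distance runs the other way, so ranking code has to sort it in ascending order or convert it to a similarity first.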
Dataset used: MovieLens 100k
Link to MovieLens Dataset