niladrem/spark_exercises

Udemy Course - Spark Exercises

Taming Big Data with Apache Spark and Python - Hands On!

Exercise 01

  • Tally up the amount spent by each customer using Spark
  • Sort the result by total amount spent per customer
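
A minimal sketch of how this exercise might be approached. It assumes the input is a CSV whose lines look like `customer_id,item_id,amount`; that layout, the file path, and all function names below are illustrative assumptions, not taken from this repository. `tally_local` is a plain-Python reference for the same computation, and `tally_with_spark` shows one possible RDD version.

```python
from collections import defaultdict


def parse_order(line):
    """Parse 'customer_id,item_id,amount' (assumed layout) into (customer_id, amount)."""
    customer_id, _item_id, amount = line.split(",")
    return int(customer_id), float(amount)


def tally_local(lines):
    """Plain-Python reference: sum amounts per customer, then sort by total."""
    totals = defaultdict(float)
    for line in lines:
        customer_id, amount = parse_order(line)
        totals[customer_id] += amount
    return sorted(totals.items(), key=lambda kv: kv[1])


def tally_with_spark(path):
    """Same computation on an RDD. Needs pyspark; 'path' is a hypothetical CSV file."""
    from pyspark import SparkConf, SparkContext  # imported lazily so the rest runs without Spark

    conf = SparkConf().setMaster("local").setAppName("SpentByCustomer")
    sc = SparkContext(conf=conf)
    try:
        totals = (
            sc.textFile(path)
            .map(parse_order)                 # (customer_id, amount)
            .reduceByKey(lambda a, b: a + b)  # total per customer
            .sortBy(lambda kv: kv[1])         # ascending by amount spent
        )
        return totals.collect()
    finally:
        sc.stop()
```

Sorting with `sortBy` on the value avoids flipping each pair to `(amount, customer_id)` before calling `sortByKey`; either route should give the same ordering.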

Exercise 02

  • Find similar movies
  • Improve the Quality of Similar Movies

Metrics used:
  • Cosine Similarity (default)
  • Pearson Correlation Coefficient
  • Jaccard Coefficient
  • Euclidean Distance
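
The four metrics listed above can be sketched in plain Python as follows. This is an illustration under assumptions: `xs` and `ys` are taken to be the paired ratings two movies received from the same users, the Jaccard coefficient is computed over the sets of users who rated each movie, and the function names are hypothetical. How the repository's scripts define each metric may differ.

```python
import math


def cosine_similarity(xs, ys):
    """Cosine of the angle between two rating vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(xs, ys))
    norm = math.sqrt(sum(x * x for x in xs)) * math.sqrt(sum(y * y for y in ys))
    return dot / norm if norm else 0.0


def pearson_correlation(xs, ys):
    """Pearson correlation of paired ratings; 0.0 if either side has no variance."""
    n = len(xs)
    if n == 0:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else 0.0


def jaccard_coefficient(users_a, users_b):
    """Share of users who rated both movies among users who rated either one."""
    a, b = set(users_a), set(users_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def euclidean_distance(xs, ys):
    """Straight-line distance between rating vectors (smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))
```

Note that Euclidean distance is a distance rather than a similarity, so a ranking of similar movies would sort it in ascending order (or convert it first, for example with `1 / (1 + d)`), while the other three are sorted in descending order.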

Dataset used: MovieLens 100k
Link to MovieLens Dataset
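
As a rough sketch of how the ratings could be prepared for the similarity step: the `u.data` file in MovieLens 100k holds tab-separated `user id`, `item id`, `rating`, `timestamp` records. The helper names below are hypothetical, and the pairing logic is a plain-Python stand-in for what a Spark job would typically do with a self-join on user ID; it is not necessarily how this repository does it.

```python
from itertools import combinations


def parse_rating(line):
    """Parse a tab-separated u.data line into (user_id, (movie_id, rating))."""
    user_id, movie_id, rating, _timestamp = line.split("\t")
    return int(user_id), (int(movie_id), float(rating))


def co_rated_pairs(lines):
    """Map (movie_a, movie_b) to a list of (rating_a, rating_b), one per shared user."""
    by_user = {}
    for line in lines:
        user_id, movie_rating = parse_rating(line)
        by_user.setdefault(user_id, []).append(movie_rating)

    pairs = {}
    for ratings in by_user.values():
        # Sorting by movie ID keeps each pair in one canonical order (a < b).
        for (m1, r1), (m2, r2) in combinations(sorted(ratings), 2):
            pairs.setdefault((m1, m2), []).append((r1, r2))
    return pairs
```

Each value in the returned dictionary can be unzipped into two rating lists and passed to a similarity metric.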
