A short PySpark project using the Bank Marketing dataset from Kaggle. We handle missing values, label and encode the categorical data, scale the numeric data, and train a Random Forest model.
Dataset available at: https://www.kaggle.com/janiobachmann/bank-marketing-dataset