Our team's solution to a machine learning competition hosted by American Express.
- subsample_data.py: Randomly selects CIDs and keeps all of their monthly statements from the provided data
- shrinkData.py: Averages the monthly statements of each CID and returns a specified amount of that data
- preprocess.py: Performs the feature engineering described in the report
- amex_metric.py: Contains the calculations for the metric used for scoring in the competition
- modelTrain.py: Trains the three models specified in the report (XGBoost, CatBoost, LightGBM), reports their CV scores, and saves the trained models
- predictLgb.py: Loads the provided LightGBM model and makes predictions on the provided data
- predictXgb.py: Loads the provided XGBoost model and makes predictions on the provided data
- predictCb.py: Loads the provided CatBoost model and makes predictions on the provided data
- lgbmTuning.py: Tunes the hyperparameters of a LightGBM model
- xgbTuning.py: Tunes the hyperparameters of an XGBoost model
- cbTuning.py: Tunes the hyperparameters of a CatBoost model
- permutation_importance.py: Computes permutation importance of the features, used to improve the dataset processing
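
The first two steps in the list (subsampling by CID, then averaging each CID's monthly statements) can be illustrated with a minimal sketch. This is not the code from subsample_data.py or shrinkData.py: the function names, the `customer_ID` key, the `balance` feature, and the toy values are all assumptions made up for illustration, and it uses only the standard library rather than whatever the scripts actually use.

```python
import random
from collections import defaultdict
from statistics import mean


def subsample_customers(rows, n_customers, seed=0):
    """Keep every statement belonging to a random sample of customer IDs."""
    # "customer_ID" is an assumed key name for the CID column.
    customer_ids = sorted({row["customer_ID"] for row in rows})
    rng = random.Random(seed)
    chosen = set(rng.sample(customer_ids, min(n_customers, len(customer_ids))))
    return [row for row in rows if row["customer_ID"] in chosen]


def average_statements(rows, feature_names):
    """Collapse each customer's monthly statements into one row of feature means."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["customer_ID"]].append(row)
    return {
        cid: {name: mean(r[name] for r in statements) for name in feature_names}
        for cid, statements in grouped.items()
    }


# Toy data: three customers, two monthly statements each (values are made up).
statements = [
    {"customer_ID": "a", "balance": 10.0},
    {"customer_ID": "a", "balance": 20.0},
    {"customer_ID": "b", "balance": 5.0},
    {"customer_ID": "b", "balance": 7.0},
    {"customer_ID": "c", "balance": 1.0},
    {"customer_ID": "c", "balance": 3.0},
]

sample = subsample_customers(statements, n_customers=2, seed=42)
averaged = average_statements(sample, ["balance"])
print(averaged)
```

Sampling at the customer level rather than the row level keeps each selected customer's statement history intact, which is what allows the per-customer averages to be computed afterwards.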