This project was part of the "Data Analytics Challenge" course at the Catholic University of Eichstätt-Ingolstadt.
Unbalanced data sets are a significant challenge in the field of machine learning and data mining. This is because conventional classification methods generally tend to favor the majority class, which often has many times more instances than the minority class in a binary classification problem. This greatly impairs the classifier's predictive ability to correctly recognize the minority instances, but it can still have a high accuracy. The minority class can also remain completely unrecognized. This issue is particularly important in the detection of fraudulent transactions in banks, credit risk assessment or the detection of firewall intrusions. Therefore, the objective was to tackle the challenge of imbalanced datasets using the Synthetic Minority Oversampling Technique (SMOTE) and a recent adaption of the SMOTE algorithm (ASN-SMOTE: a synthetic minority oversampling method with adaptive qualified synthesizer selection).