This repository collects machine learning projects developed as part of the Mathematical Trading and Finance MSc programme at Bayes Business School (formerly Cass). The projects cover the core stages of a machine learning workflow: data preprocessing, exploratory analysis, and the development, evaluation, and optimisation of both supervised and unsupervised models. They offer hands-on practice with techniques applicable to mathematical trading and finance, and are aimed at students and practitioners in the field.
- Objective: Apply classification algorithms (Decision Trees and Random Forests) to predict secondary school student performance using demographic, social, and school-related features.
- Approach:
  - Build predictive models with decision tree-based methods.
  - Analyse how individual features affect predicted student performance.
- Presentation:
  - An interactive Shiny app explains the model structure, methodology, and performance metrics in a way that is accessible to non-technical audiences.
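
The tree-based approach above can be sketched roughly as follows. This is a minimal illustration in Python with scikit-learn, not the project's actual code: the data is synthetic (`make_classification`), the `feature_*` names are placeholders, and the hyperparameters are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the student data (assumption: the real dataset
# has different features and a different target definition).
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

# Hold out a stratified test set so both models are scored on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A single shallow tree versus an ensemble of trees.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

tree_acc = tree.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)
print(f"decision tree accuracy: {tree_acc:.3f}")
print(f"random forest accuracy: {forest_acc:.3f}")

# Impurity-based importances give a first look at which features matter.
importances = dict(zip(feature_names, forest.feature_importances_))
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranked:
    print(f"{name}: {value:.3f}")
```

Impurity-based importances can favour high-cardinality features, so permutation importance may be worth checking alongside them.
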
- Objective: Use unsupervised learning to explore latent structure in the characteristics of 50 top-rated IMDb films.
- Approach:
  - Reduce dimensionality with Principal Component Analysis (PCA) to identify the dominant patterns in the dataset.
  - Group films by their underlying characteristics using KMeans and hierarchical clustering.
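
As a rough sketch of the PCA and clustering steps above, assuming Python with scikit-learn: the 50 x 5 matrix below is random noise standing in for the film features, and the choices of two components, three clusters, and Ward linkage are illustrative assumptions rather than the project's settings.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Random placeholder for 50 films x 5 numeric characteristics
# (assumption: the real feature set is different).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))

# Standardise first so no single feature dominates the principal components.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the first two principal components.
pca = PCA(n_components=2)
scores = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_
print("explained variance ratio:", explained.round(3))

# Cluster the films in the reduced space with both methods.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(scores)

print("KMeans cluster sizes:", np.bincount(kmeans_labels))
print("hierarchical cluster sizes:", np.bincount(hier_labels))
```

In practice the number of components and clusters would be chosen from the data, for example with a scree plot and silhouette scores.
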
- Objective: Investigate whether a linear or a nonlinear decision boundary better classifies Coronary Heart Disease (CHD) in a high-risk male population from the Western Cape, South Africa.
- Approach:
  - Compare multiple classification approaches to determine the most effective decision boundary.
  - Examine how the risk factors interact and how they influence CHD classification.
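
One way to frame the linear versus nonlinear comparison above is to cross-validate one model of each kind on the same data. The sketch below assumes Python with scikit-learn and uses logistic regression and an RBF-kernel SVM as stand-ins; the source does not name the classifiers actually compared, and the data here is synthetic (`make_moons`), not the CHD dataset.

```python
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data (assumption: stands in for the CHD risk factors).
X, y = make_moons(n_samples=300, noise=0.25, random_state=0)

# Linear boundary: logistic regression on standardised inputs.
linear_model = make_pipeline(StandardScaler(), LogisticRegression())

# Nonlinear boundary: support vector machine with an RBF kernel.
nonlinear_model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Score both models with the same 5-fold cross-validation.
linear_scores = cross_val_score(linear_model, X, y, cv=5, scoring="accuracy")
nonlinear_scores = cross_val_score(nonlinear_model, X, y, cv=5, scoring="accuracy")

print(f"linear mean accuracy:    {linear_scores.mean():.3f}")
print(f"nonlinear mean accuracy: {nonlinear_scores.mean():.3f}")
```

Putting the scaler inside each pipeline keeps the standardisation fitted on the training folds only, which avoids leaking information from the validation fold.
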