[https://github.com/Metaphysicist1/Predicting-Retirement-Age/]
About
This is a data science project focused on predicting retirement age using advanced machine learning techniques applied to the Survey of Health, Ageing and Retirement in Europe (SHARE) Wave 6 dataset. This longitudinal micro-data infrastructure provides rich information on health, socio-economic factors, and retirement behaviors of individuals aged 50+ across 17 European countries and Israel. The project aims to uncover insights into the aging process and inform policy by developing robust classification models to predict retirement age categories (e.g., early, normal, late). Key steps include comprehensive data preprocessing, model development, and evaluation, with a focus on bio-medical and socio-economic predictors.
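To make the classification target concrete, here is a minimal sketch of how a retirement age could be binned into the early / normal / late categories mentioned above. The cut-offs (`EARLY_BELOW`, `LATE_FROM`), the helper name `categorize_retirement_age`, and the sample ages are all hypothetical; the thresholds the project actually uses are not stated here and may differ by country.

```python
import pandas as pd

# Hypothetical cut-offs, NOT taken from the project:
# early: age < 60, normal: 60 <= age < 66, late: age >= 66.
EARLY_BELOW = 60
LATE_FROM = 66


def categorize_retirement_age(ages: pd.Series) -> pd.Series:
    """Bin retirement ages (in years) into early / normal / late."""
    return pd.cut(
        ages,
        bins=[0, EARLY_BELOW, LATE_FROM, float("inf")],
        right=False,  # intervals are closed on the left: [low, high)
        labels=["early", "normal", "late"],
    )


# Made-up ages for illustration only.
ages = pd.Series([55, 60, 65, 66, 70], name="retirement_age")
categories = categorize_retirement_age(ages)
print(categories.tolist())
```

Keeping the thresholds as named constants makes it easy to rerun the pipeline with a different definition of "early" or "late".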
Acknowledgment: This project uses data from SHARE Wave 6 (DOI: [insert relevant DOI]); see Börsch-Supan et al. (2013) for methodological details.
Classification
- Best Model: XGBoost Classifier
- Evaluation Metrics: Accuracy, AUC-ROC
- Results: 85% accuracy, 0.88 AUC-ROC
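With three target classes, AUC-ROC has to be averaged across classes in some way, and the figures above do not say how. The sketch below shows one common choice, macro-averaged one-vs-rest AUC via scikit-learn's `roc_auc_score`, next to plain accuracy. The labels and probabilities are made up for illustration, and the class encoding (0 = early, 1 = normal, 2 = late) is an assumption.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Assumed encoding: 0 = early, 1 = normal, 2 = late (illustrative only).
y_true = np.array([0, 0, 1, 1, 2, 2])

# Made-up predicted class probabilities; each row sums to 1.
proba = np.array([
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],  # true class 0, but class 1 has the highest probability
    [0.2, 0.6, 0.2],
    [0.1, 0.7, 0.2],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
])

# Accuracy uses only the most probable class per row.
y_pred = proba.argmax(axis=1)
accuracy = accuracy_score(y_true, y_pred)

# AUC-ROC uses the full probability ranking, one class against the rest,
# then takes the unweighted mean over classes.
auc_ovr = roc_auc_score(y_true, proba, multi_class="ovr", average="macro")

print(f"accuracy={accuracy:.3f}, macro OvR AUC-ROC={auc_ovr:.3f}")
```

The two metrics can disagree: accuracy penalizes the one misclassified row, while AUC-ROC only asks whether each class's true members are ranked above the rest.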
Literature Review
Overview of existing research on aging, retirement, and machine learning applications, including key references for SHARE (Börsch-Supan et al., 2013; Malter & Börsch-Supan, 2017).
Dataset Characteristics
Exploratory data analysis (EDA) of SHARE Wave 6, including visualizations of demographic, health, and socio-economic variables.
Baseline Model
Implementation and evaluation of a baseline logistic regression model for predicting retirement age categories.
Model Definition and Evaluation
Development and evaluation of advanced models (Random Forest, XGBoost) with feature importance analysis and performance metrics.
Presentation
Presentation slides summarizing the project methodology, results, and policy implications.
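The baseline and advanced modeling steps described above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the data is synthetic (generated with `make_classification` as a stand-in for SHARE features, since the real data requires registration), the hyperparameters are arbitrary, and only scikit-learn models are used. XGBoost's `XGBClassifier` follows the same fit/predict interface and could be added to the `models` dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic three-class data standing in for the SHARE predictors (assumption).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    # Baseline: logistic regression on standardized features.
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # Tree ensemble; hyperparameters here are arbitrary, not tuned.
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {scores[name]:.3f}")

# Impurity-based feature importances from the fitted forest, highest first.
importances = models["random_forest"].feature_importances_
ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)
for index, importance in ranked:
    print(f"feature_{index}: {importance:.3f}")
```

Evaluating every model on the same held-out split keeps the comparison with the baseline fair. Impurity-based importances can favor high-cardinality features, so permutation importance is a reasonable cross-check.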
References
- Börsch-Supan, A., Brandt, M., Hunkler, C., Kneip, T., Korbmacher, J., Malter, F., Schaan, B., Stuck, S., & Zuber, S. (2013). Data Resource Profile: The Survey of Health, Ageing and Retirement in Europe (SHARE). International Journal of Epidemiology, 42(4), 992–1001. DOI: 10.1093/ije/dyt088.
- Malter, F., & Börsch-Supan, A. (Eds.) (2017). SHARE Wave 6: Panel innovations and collecting Dried Blood Spots. Munich: Munich Center for the Economics of Aging (MEA).
Notes
- Data Access: The SHARE Wave 6 dataset is available through the SHARE Research Data Center (http://www.share-project.org) upon registration and acceptance of the data usage agreements.
- Ethical Use: Handle sensitive health and socio-economic data ethically, in line with SHARE guidelines.
- Dependencies: Python with pandas, scikit-learn, XGBoost, matplotlib, and seaborn for analysis and modeling.