This project utilises hard parameter sharing for multi-task learning in neural networks to model multiple material properties: all tasks pass through shared hidden layers before branching into task-specific networks.
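A minimal Keras sketch of this layout, assuming illustrative layer sizes and a 32-dimensional input; none of these values come from the repository:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

n_features = 32  # assumed input dimensionality

inputs = layers.Input(shape=(n_features,), name="composition_features")

# Shared trunk: hidden layers whose weights are updated by every task.
x = layers.Dense(64, activation="relu", name="shared_1")(inputs)
x = layers.Dense(64, activation="relu", name="shared_2")(x)

# Task-specific heads branching off the shared representation.
hardness_out = layers.Dense(1, name="hardness")(layers.Dense(32, activation="relu")(x))
corrosion_out = layers.Dense(1, name="corrosion")(layers.Dense(32, activation="relu")(x))

# One model per task; both reuse the same shared-layer weights.
model_hardness = Model(inputs, hardness_out)
model_corrosion = Model(inputs, corrosion_out)
model_hardness.compile(optimizer="adam", loss="mse")
model_corrosion.compile(optimizer="adam", loss="mse")
```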
The primary problem addressed is training on sparsely labelled data, such as hardness and corrosion measurements that were not recorded for the same samples and sit in disparate materials-science datasets. The approach is combined with model ensembling.
This repository contains the complete workflow: exploratory data analysis, data preparation, model compilation, training, evaluation, and explainability steps.
The training algorithm operates iteratively, alternating between the two data types (hardness/corrosion); each step updates both the shared and the task-specific weights. This yields separate yet intrinsically linked models for hardness and corrosion that share their hidden-layer weights.
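A hedged sketch of such an alternating loop, assuming the two compiled per-task models from the sketch above and hypothetical label arrays `X_h, y_h` (hardness) and `X_c, y_c` (corrosion); the batch size and epoch count are arbitrary choices:

```python
import numpy as np

def train_alternating(model_hardness, model_corrosion,
                      X_h, y_h, X_c, y_c,
                      epochs=100, batch_size=32):
    for epoch in range(epochs):
        # Hardness step: updates shared layers + hardness head.
        idx = np.random.choice(len(X_h), size=min(batch_size, len(X_h)), replace=False)
        model_hardness.train_on_batch(X_h[idx], y_h[idx])
        # Corrosion step: updates shared layers + corrosion head.
        idx = np.random.choice(len(X_c), size=min(batch_size, len(X_c)), replace=False)
        model_corrosion.train_on_batch(X_c[idx], y_c[idx])
```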
- Shared and Task-Specific Networks: Leverages shared representations to improve learning efficiency and reduce overfitting.
- Model Ensemble and Uncertainty Quantification: Employs ensemble learning to handle heterogeneous small datasets. Combined with Monte Carlo dropout, it quantifies predictive uncertainty as the variance of multiple stochastic inferences per input (see the sketch after this list).
- Hyperparameter Optimisation: Utilises Bayesian optimisation for efficient hyperparameter search (especially over subnetwork configurations), selecting models by R² score (a GPyOpt sketch appears below).
- Model Explainability: Implements a Shapley value-based framework for local explainability of model predictions (a simplified gradient-based attribution sketch appears below).
- All of the above features are unified in a single model.
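A minimal sketch of the ensemble + Monte Carlo dropout inference described above, assuming an iterable of trained Keras models that contain `Dropout` layers; the number of stochastic passes is an arbitrary choice:

```python
import numpy as np

def predict_with_uncertainty(ensemble, X, n_mc=50):
    """Mean and standard deviation over ensemble members and MC-dropout passes."""
    preds = []
    for model in ensemble:
        for _ in range(n_mc):
            # training=True keeps dropout active at inference time,
            # so each pass samples a different thinned network.
            preds.append(model(X, training=True).numpy())
    preds = np.stack(preds)  # shape: (members * n_mc, n_samples, 1)
    return preds.mean(axis=0), preds.std(axis=0)
```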
The project is implemented in TensorFlow/Keras, with dimensionality reduction via UMAP, Bayesian hyperparameter tuning via GPyOpt, and feature attribution via a gradient-based explainer.
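To illustrate the GPyOpt-based search, a hedged sketch follows; the search space and the `build_and_score` helper (a hypothetical function that trains a model for one configuration and returns its validation R²) are assumptions, not the repository's actual setup. GPyOpt minimises by default, so the objective returns the negative R²:

```python
import GPyOpt
import numpy as np

# Assumed search space over subnetwork configurations.
domain = [
    {"name": "n_shared_units", "type": "discrete",   "domain": (32, 64, 128)},
    {"name": "n_task_units",   "type": "discrete",   "domain": (16, 32, 64)},
    {"name": "dropout_rate",   "type": "continuous", "domain": (0.0, 0.5)},
]

def objective(x):
    # x is a 2D array of candidate configurations, one per row;
    # build_and_score is a hypothetical train-and-evaluate helper.
    return np.array([[-build_and_score(row) for row in x]]).T

opt = GPyOpt.methods.BayesianOptimization(f=objective, domain=domain)
opt.run_optimization(max_iter=30)
print("best configuration:", opt.x_opt, "best R²:", -opt.fx_opt)
```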
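And a deliberately simplified gradient×input attribution, standing in for the repository's Shapley value-based framework (a proper Shapley estimator would average gradients over baselines and feature coalitions; this single-gradient version only illustrates the idea):

```python
import tensorflow as tf

def gradient_input_attribution(model, X):
    """Per-feature attribution: d(prediction)/d(feature) * feature value."""
    X = tf.convert_to_tensor(X, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(X)
        y = model(X)
    return (tape.gradient(y, X) * X).numpy()
```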
The original datasets used in this project are mechanical properties (Borg et al.) and corrosion properties (Nyby et al.).