nforsg/SF290X_github

PMI Estimation using microscopic image sampling

Project description

We have collected samples of biological material from the nose and mouth of 7 cadavers between day $0$ and day $5$ after death, at the Australian Facility for Taphonomic Experimental Research (AFTER). The goal is to develop a mathematical framework or predictive model that estimates the postmortem interval (PMI), which is the time, measured in days, since a person died. Current methods are either inaccurate or require substantial resources and practitioners skilled in DNA sequencing to collect data for predictive models.

Approach

This project investigates whether the reliance on DNA sequencing of microbial communities can be reduced by instead imaging the communities with phase-contrast microscopy. These images are the raw data for the model.

Solution after conducting literature research

Because the data are collected on only a few discrete days, I have decided to frame the problem as an image classification task over the generated images. Using machine learning image recognition models to predict PMI is a way to develop methods at the same rate as the biological data is collected, which is the core pillar of this project.

Convolutional Neural Network Candidates

The two most relevant articles from the literature study are:

  • CytoImageNet - Stanley Hua et al. This article introduces a new dataset called CytoImageNet for training neural networks on classification tasks in microscopy bioinformatics. The authors trained EfficientNetB0 on the dataset and released the weights as open source.
  • Transfer Learning with Deep CNNs for morphology classification - Spjuth et al. This article is cited by Hua et al. and uses transfer learning to solve two particular classification tasks.

Related classification tasks, and how they connect to PMI Estimation

Both Hua et al. and Spjuth et al. tackle the BBBC021 compound profiling experiment. The task is to assign compounds to their mechanism of action (MOA), using only images of breast cancer cells treated with those compounds as raw data.

These studies have been excellent inspiration for my project. In their setting the identity of the compound is irrelevant; only the shapes of the cells are used to assign images to MOAs. In my setting, cadavers and facial sites play the role of the compounds, and images of cells are classified into the correct PMI based on the relative abundance of the different morphological profiles visible in the images.
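
As a minimal sketch of this labelling scheme, the snippet below maps image records to PMI classes. The record fields (`cadaver`, `site`, `day`) and the example values are hypothetical; only the day range 0 to 5 comes from the project description. Cadaver and facial site are kept as metadata but do not enter the label, mirroring how compound identity is ignored in the BBBC021 setting.

```python
from collections import Counter
from dataclasses import dataclass

# PMI classes: one per sampling day, day 0 through day 5.
PMI_DAYS = tuple(range(6))


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical metadata for one microscopy image."""
    cadaver: int  # donor identifier, not used as a label
    site: str     # "nose" or "mouth", not used as a label
    day: int      # days since death at sampling time


def pmi_label(record: ImageRecord) -> int:
    """Return the class index for a record; only the sampling day matters."""
    if record.day not in PMI_DAYS:
        raise ValueError(f"day {record.day} is outside the sampled range 0-5")
    return PMI_DAYS.index(record.day)


def class_counts(records):
    """Count images per PMI class, e.g. to check class balance before training."""
    return Counter(pmi_label(r) for r in records)


# Made-up example records, for illustration only.
records = [
    ImageRecord(cadaver=4, site="nose", day=0),
    ImageRecord(cadaver=4, site="mouth", day=0),
    ImageRecord(cadaver=9, site="nose", day=3),
]
counts = class_counts(records)
```

Counting images per class in this way is a cheap check for class imbalance before any network is trained.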

Documentation

Disregard all folders except the one called "Project code". That is where all the project-related code lives; the rest documents my learning process throughout the semester.

All in all, I have taken about 20,000 microscopy images from 7 cadaver samples. Following Hua et al. and Spjuth et al., I have decided to solve the classification problem using EfficientNetB0, ResNet50, InceptionV3 and InceptionResNetV2. So far, I have produced results for EfficientNetB0 (complete) and ResNet50 (almost finished). Tomorrow, I will run the two Inception networks as well. All results are in the notebooks for each network.
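
To give a feel for how little is actually trained when a pretrained backbone is kept frozen, the sketch below counts the parameters of a hypothetical classification head for each candidate network. It assumes the head is a single dense softmax layer on top of the global-average-pooled features and that there are six classes (PMI day 0 to day 5); the head actually used in the notebooks may differ.

```python
# Length of the pooled feature vector each backbone produces
# (global average pooling over the last convolutional block).
FEATURE_DIM = {
    "EfficientNetB0": 1280,
    "ResNet50": 2048,
    "InceptionV3": 2048,
    "InceptionResNetV2": 1536,
}

NUM_CLASSES = 6  # assumption: one class per PMI day, day 0 to day 5


def dense_head_params(feature_dim: int, num_classes: int = NUM_CLASSES) -> int:
    """Weights plus biases of a single dense softmax layer."""
    return feature_dim * num_classes + num_classes


head_sizes = {name: dense_head_params(dim) for name, dim in FEATURE_DIM.items()}
for name, n in head_sizes.items():
    print(f"{name}: {n} trainable parameters in the head")
```

Under these assumptions the head holds on the order of ten thousand parameters, compared with millions in each backbone, so the choice of backbone mainly changes the cost of the forward pass rather than the number of weights being fitted.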

I have written a "Utils" file that stores all functional code, including plotting functions and the image preprocessors applied before images are fed to the model.

Questions

These questions relate to EfficientNetB0 and ResNet50 and to how I should proceed.

  • Transfer learning

I have used transfer learning and frozen all layers in base_model at their ImageNet weights. Only the layers I add myself at the end are trainable. Training already takes quite a long time on my GPU. Combined with the fact that the purpose of the project is to find models that ease the workflow and have potential for further development, this means I will not train the backbone weights. Would you have reasoned the same way if you were to publish an article about this project?

  • Parameter tuning and data approach

Right now I have written an explicit live script for each CNN I test predictions with. I have split everything into two sessions:

  • OLD Session

Here I have samples from cadavers 4-7 and initially make predictions without regularization.

  • NEW Session

Here I include cadavers 9-11 and thus get about 8,000 additional input samples. My evaluation will focus largely on how well each CNN model predicts on the validation data as it receives more data. The more data we get from different body donations, the more we can do with the models, so it is important to find a model that does not perform well on the OLD session but poorly on the NEW session. Is this a reasonable motivation for the discussion of results in this project?
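
One way to make the OLD versus NEW comparison concrete is to compute validation accuracy per session and look at the difference. The sketch below is only an illustration: the function names are hypothetical, the labels and predictions are made up, and plain accuracy is an assumed metric rather than the one used in the notebooks.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true PMI class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def session_gap(results):
    """Validation accuracy per session and the change from OLD to NEW.

    `results` maps a session name to a (y_true, y_pred) pair.
    A clearly negative gap means the model got worse with more data.
    """
    acc = {name: accuracy(t, p) for name, (t, p) in results.items()}
    return acc, acc["NEW"] - acc["OLD"]


# Made-up labels and predictions, for illustration only.
results = {
    "OLD": ([0, 1, 2, 3], [0, 1, 2, 0]),
    "NEW": ([0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 0]),
}
acc, gap = session_gap(results)
print(acc, gap)
```

Running the same comparison for every candidate network gives one number per model that can be reported side by side in the discussion.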

  • Overall code practice

When you skim through the code, is there anything you would clearly have done differently from what I have done in order to get better results?

Again, the code is essentially ready to produce results for all CNNs; I only need to run them.

  • Analysis of fine-tuning of results

I will mainly just introduce regularization, and possibly more epochs, as changes to the CNN models, nothing more. Do you think that is enough?

About

for my master's thesis
