Jordan Micah Bennett, software engineer/creator of "RobotizeJa".
The aim is to develop a quick way to detect the nCov 2019 strain (Coronavirus 2019/2020; the virus is "SARS-CoV-2" and the disease it causes is "Covid-19"), using artificial neural networks or other machine learning model types.
This project began on January 29, 2020, here: SMART-CORONA_VIRUS_DETECTOR. This Xray-scan version began on Feb 9, 2020.
This is the first known attempt, commencing on January 29, 2020, at collaborating to construct this type of program. If you know of open source packages with similar goals, please point to them by emailing jordanmicahbennett@gmail.com.
-
This can also reasonably allow less experienced medical personnel to make preliminary diagnoses, expanding the diagnosis effort overall. This effort may contribute towards virus-control progress, together with other AI-based endeavours being developed across the globe, such as the use of AI for vaccine development.
-
This convolutional neural network architecture can reasonably also be trained on CT-Scan image data (which many Covid19 papers seem to concern), separate from the Xray data (from the non-Covid19 Pneumonia Kaggle Process) upon which training occurred.
-
Feb 9, 2020: I discovered similarities between Covid19 and known forms of pneumonia, after which I found a few Xray images representing positive cases of Covid19 from Chinese authorities. I then decided to perform artificial intelligence based Xray image scan diagnostics, using those images as inputs to an artificial intelligence based pneumonia diagnosis method originally published/made by John Chang in November 2019. This reasoning is seen in my research/discovery process in the Deep Learning Code section below.
- My repository includes a few fixes of the original repository, including a simple user interface to facilitate command-free automated covid19 diagnosis.
-
Feb 19, 2020: Scientists report ~98% sensitivity for human/radiologist CT scan image based diagnostics, compared to the popular polymerase chain reaction (RT-PCR) method used by the CDC: "In a series of 51 patients with chest CT and RT-PCR assay performed within 3 days, the sensitivity of CT for COVID-19 infection was ~98% compared to RT-PCR sensitivity of ~71% (p<.001)."
-
Feb 20, 2020: Great news - a Feb 20 news report states that Chinese researchers are using AI to help identify the virus with a reported ~99% accuracy, via their own AI-based CT-scan method.
- Unfortunately, unlike this repository, which I started on Feb 9th, no Chinese AI-based algorithm seems to have been published to help facilitate global control of Covid19/SARS-CoV-2.
-
Feb 26, 2020: Chinese researchers reveal free access to an artificial intelligence based online Covid19 detection tool, although still no code or patient data has been revealed. Because the tool is online only, detection may be slow for users without a good internet connection.
- I still call to have code/data released for enhanced covid19 spread control.
- One reason why China should reasonably release its code and data is that the trained algorithm and data, while providing a good basis, may be susceptible to race-based bias, simply because most Covid19 patient data so far comes from Chinese patients.
- My showcasing of this repository's code, and/or my suggested publication of China's AI code, may enable further training on data that reflects the race distribution of the target nation where Covid19 screening is required, as seen in other work that stresses accounting for diversity.
-
March 13, 2020: Kaggle launches a large global effort to combat Covid19, with a call to action including a data collection of over 29,000 Covid19-associated papers.
-
March 16, 2020: Adrian Rosebrock produced a Covid19 detector with ~90% accuracy and ~80% sensitivity, using the Keras machine learning library, from a recent Covid19 Xray dataset released 4 days earlier.
- Molecular and Serology Tests: ~20 minutes for equipment prep alone, for real-time polymerase chain reaction (RT-PCR), and up to 24 hours before testing is verified. See bullet point 3, in answer to question "Can someone who has had COVID-19 spread the illness to others?" on this CDC website/covid19 FAQ page.
- Xray Image Scan + Artificial Intelligence Diagnosis: ~5 minutes (for scan) + A few milliseconds for Ai diagnosis = ~6 minutes total time for diagnosis result including possible image processing.
Coronavirus: Whole world 'must take action', warns WHO
Update Jan 31, 2020/WHO declares the new coronavirus outbreak a Public Health Emergency of International Concern
- WHO's warning should reasonably have come about a week earlier, as advised about a week ago via Chris Martenson, who I also refer to below regarding his 115 million nCov case prediction count.
- Update February 7, 2020: Artificial Intelligence Prediction: In 45 days, ~2.5 billion to be infected, ~52 million of whom may die. See also this detailed Forbes report.
- The nCov 2019 (Coronavirus Strain 2019/2020) is spreading rapidly, with a mortality rate between 2% and 4%.
- By comparison, the common flu, with a far lower mortality rate of 0.1%, kills 291,000 to 646,000 per year.
- Things get worse; nCov spreads at ~triple the transmission rate of the common flu.
- Common flu R0 = 1.28 (estimated transmission rate)
- nCov R0 = 2.5 to 3.8 (estimated transmission rate)
- Recent nCov R0 estimate: ~4.08!
- The nCov 2019/Covid19 incubation period was recently estimated at 24 days, and according to The Sun newspaper, a Chinese woman was recently struck down with symptoms after a probation period of 15 days!
- Current diagnosis methods may miss the presence of the virus due to faulty dna based comparison methods, where multiple negative test results may occur before positive results are gained. In addition, more doctors (or rather more automated diagnosis methods) can improve identification rates of the virus.
- This ai driven method will reasonably help to stop the exponential growth of the nCov strain.
- 1 more month of exponential nCov growth = ~ 115 million cases, (of which ~ 23 million are potentially life threatening ones) according to an epidemiologist/PhD pathologist.
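The transmission comparison in the list above can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in this section (flu R0 of 1.28, nCov R0 of 2.5 to 3.8, and the February 7 prediction of ~2.5 billion infections with ~52 million deaths). The function and variable names are illustrative, and the code makes no epidemiological claim of its own.

```python
# Figures quoted in the list above (estimates, not measurements).
FLU_R0 = 1.28
NCOV_R0_LOW, NCOV_R0_HIGH = 2.5, 3.8


def r0_ratio(ncov_r0, flu_r0=FLU_R0):
    """How many times larger the nCov R0 estimate is than the flu estimate."""
    return ncov_r0 / flu_r0


def implied_fatality_rate(deaths, infections):
    """Fatality rate implied by a predicted death count and infection count."""
    return deaths / infections


low = r0_ratio(NCOV_R0_LOW)
high = r0_ratio(NCOV_R0_HIGH)
print(f"nCov R0 is {low:.2f}x to {high:.2f}x the flu R0")

rate = implied_fatality_rate(52e6, 2.5e9)
print(f"Implied fatality rate of the Feb 7 prediction: {rate:.2%}")
```

On these numbers, "~triple" corresponds to the upper end of the quoted R0 range, and the fatality rate implied by the prediction falls inside the 2% to 4% range stated above.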
Code
1. Covid-19 (Coronavirus 2019/nCov) shares many similarities with pneumonia. In fact, the time course evolution of a specific strain of Covid-19 pneumonia is studied here.
2. Pneumonia deep learning platforms already exist, including Kaggle content rife with deep learning kernels/solutions for pneumonia detection.
3. A pretrained neural network was chosen from Google, pertaining to (2). Using a pretrained model avoids training on the 2 gigabyte pneumonia/non-pneumonia training set.
- I added a quick function, "doOnlineInference", to the code. This is a convenient way to invoke diagnosis on an input image.
4. A Covid-19 positive Xray scan was taken from figure 1a of this recent Covid-19 paper.
5. Another Covid-19 positive Xray scan was taken from figure 1 of this Covid-19 paper.
6. The function from (3) was invoked on (4), and (4) was successfully detected as Covid-19 positive, i.e. the model reported high confidence of pneumonia. The function from (3) was also invoked on (5); there the model reported only a very low confidence that the input was normal/non-pneumonia. In other words, both Covid-19 positive input images, (4) and (5), induced high neural network confidence of the presence of Covid-19 pneumonia.
- Deep learning based upscaling was applied to input image (5), which was of low resolution compared to the training data from Kaggle.
- Upscaling changed the result for input (5): the model predicted an even lower confidence of non-pneumonia, i.e. closer to ground truth. Upscaling did not change the result for input (4), whose initial resolution was already high, closer to that of the smallest sample in the Kaggle dataset.
- This could be a good preliminary sign that this tool could be used to actively detect novel coronavirus cases from Xray scans.
7. Preliminary Conclusion
- This will reasonably work on potential mild Covid-19 pneumonia patients, within ~0 to 4 days of infection, with "repeated pulmonary CTs", where positive findings of pneumonia-associated abnormalities are discoverable.
- This will likely work better for patients after ~5 days of Covid-19 infection, as abnormalities become distributed across the lungs, where initial CT scans could better discover the Covid-19 markers.
- See the paper's conclusion for the reasoning above.
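The inference step described above, where doOnlineInference is invoked on the Covid-19 positive scans, ends with reading a pair of class confidences. The following is a minimal sketch of that last step and not the repository's actual doOnlineInference: the name `interpret_prediction`, the class order (normal first, pneumonia second), and the 0.5 threshold are all assumptions made for illustration.

```python
# Hypothetical class order; the real model's order may differ.
CLASS_NAMES = ("normal", "pneumonia")


def interpret_prediction(confidences, threshold=0.5):
    """Turn two class confidences into a (label, confidence) pair.

    `confidences` is assumed to be (p_normal, p_pneumonia), summing to ~1.
    """
    if len(confidences) != len(CLASS_NAMES):
        raise ValueError("expected one confidence per class")
    p_normal, p_pneumonia = confidences
    if p_pneumonia >= threshold:
        return CLASS_NAMES[1], p_pneumonia
    return CLASS_NAMES[0], p_normal


# Illustrative numbers only, not outputs of the real model.
label, conf = interpret_prediction((0.03, 0.97))
print(f"{label} ({conf:.0%} confidence)")
```

A very low confidence for "normal" and a high confidence for "pneumonia" are two views of the same prediction, which is why the results for both positive scans above are read as positive.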
1. Download the entire repository, which contains my version of the original code from another repository by John Chang.
2. Download the saved weights from the original repository, and ensure both the code and weights are in the same place.
3. Download the 2 gigabytes of training/test data from Kaggle.
4. Run the doOnlineInference function from my version of the original code on any of the test data from the 2 gigabyte Kaggle directory, or on the single positive Covid-19 example seen in this repository, which was taken from figure 1a of this recent Covid-19 paper.
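Before running inference, it can help to confirm that the code and weights really are in the same place, as the steps above require. A small sketch, assuming hypothetical file names (`diagnoser.py` and `weights.h5` are placeholders, not the repository's actual file names):

```python
from pathlib import Path


def missing_files(folder, required_names):
    """Return the names from `required_names` that are absent from `folder`."""
    folder = Path(folder)
    return [name for name in required_names if not (folder / name).is_file()]


if __name__ == "__main__":
    # Placeholder names; substitute the real code and weights file names.
    required = ["diagnoser.py", "weights.h5"]
    absent = missing_files(".", required)
    if absent:
        print("Missing from this folder:", ", ".join(absent))
    else:
        print("Code and weights are in the same place.")
```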
Update: February 18, 2020
-
Except for item (4), follow all instructions from "Code setup (basic user interface)" section above.
-
Run my user interface, which works with my version of the original code from this repository. One can either double click the covid19_ai_diagnoser_ui.py file, or open the file with IDLE, and run there.
-
Select an image that pertains to a suspected case, although in the Screenshot:
-
Notice the log with the results of the neural network's prediction above the large blue arrow, and that the image has been loaded for viewing to the right of the small blue arrow:
-
The model is ~92% accurate on the original task of pneumonia/non-pneumonia classification.
CT Scan Manual Diagnosis and Explosion in infection reports
- By extension, apart from human radiologist detection, perhaps an ai based image detection solution can speed up diagnosis, and help to replace the faulty dna based comparison phase. I've also requested more CT image data from a scientist involved with manual diagnosis using CT scan data.
-
Images from recent covid-19 study: "Emerging Coronavirus 2019-nCoV Pneumonia"
-
Images from recent covid-19 study: "Imaging Profile of the COVID-19 Infection: Radiologic Findings and Literature Review"
By extension, the tool below, by researchers at Johns Hopkins University, is useful for real-time tracking of nCov:
https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
Note that despite the ~900+ infection cases reported by China on January 24, by stark contrast, a medical scientific paper estimated that ~105,000+ infections had actually occurred at that time.
I call on the Ministry of Health of Jamaica (as well as those of other countries) to use their administrative status to try to acquire more Covid19 positive CT scan images (in a federated format that excludes patient identity) from China and elsewhere, for improving pneumonia-based AI systems. One such system is the one that I have had prepared since February 9, 2020, which I found to successfully detect Covid19 presence in the small set of Covid-19 positive Xray scans found online so far, in papers by Yuen et al. and others.
- Alternatively, the Chinese artificial intelligence algorithm/solution together with the data could be attained using the same administrative method.
- In future scenarios, a "Division of Artificial Intelligence Based Health Development" or sector of artificial intelligence based research should reasonably exist in the Ministry of Health, that could enable Ai solutions to be rapidly researched/developed, to facilitate production of vaccines, and treatment, as seen in a recent example where MIT developed antibiotics based on Ai research/development.
My advice to Ministry of Health (February 17, 2020): https://drive.google.com/file/d/1BNXkKJPZuMx64XzwqFmQEpC5s9-C3tJH/view?usp=sharing
-
Jordan added a fix to the original author's repository, to enable correct validation. John Chang had inadvertently defined a "test_dataGen.flow_from_directory" function parameter as a training dataset input, instead of a test dataset input.
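The fix amounts to pointing the test generator at the test data. The sketch below illustrates the idea without reproducing either repository's code: `make_generators`, the directory arguments, and the image size are hypothetical, while `flow_from_directory` with its `target_size`, `batch_size`, `class_mode`, and `shuffle` parameters is the Keras `ImageDataGenerator` method of that name. The function accepts any objects that expose that method, so it can be exercised without Keras installed.

```python
def make_generators(train_datagen, test_datagen, train_dir, test_dir,
                    image_size=(150, 150), batch_size=32):
    """Build training and test generators, each from its own directory.

    The datagen arguments are expected to behave like Keras
    ImageDataGenerator instances. Sizes and directories are placeholders.
    """
    train_gen = train_datagen.flow_from_directory(
        train_dir, target_size=image_size,
        batch_size=batch_size, class_mode="categorical")
    # Validation must read from the test directory. Passing a training
    # input here is the kind of mix-up described above.
    test_gen = test_datagen.flow_from_directory(
        test_dir, target_size=image_size,
        batch_size=batch_size, class_mode="categorical",
        shuffle=False)  # keep order stable so predictions align with labels
    return train_gen, test_gen
```

With Keras, the two datagen arguments would typically be `ImageDataGenerator(rescale=1./255)` instances.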
-
Jordan updated his version of the original code to repair a compile issue. This makes it possible to evaluate the accuracy of the saved model (which loads in ~2 minutes on a GTX 1060/i7 CPU) without invoking the model-training function model.fit, which would take hours on the same machine.
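Evaluating a saved model without retraining generally follows a load, compile, evaluate sequence. A rough sketch under assumptions: `evaluate_saved_model` and the optimizer/loss choices are illustrative rather than taken from the repository, while `load_weights`, `compile`, and `evaluate` are standard Keras `Model` methods (older Keras versions use `evaluate_generator` for generator input). The key point is that `model.fit` is never called.

```python
def evaluate_saved_model(model, weights_path, test_data):
    """Load saved weights and score the model without any training.

    `model` is expected to behave like a Keras Model whose architecture
    matches the saved weights. The compile settings below are assumptions.
    """
    model.load_weights(weights_path)
    # Keras requires compile() before evaluate(), even when no training
    # is intended, because compile() attaches the loss and metrics.
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model.evaluate(test_data)
```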
-
Based on Andrei's suggestions, Jordan replaced labels that he had initially mislabelled as CT with X-Ray. This correction is very important, and could influence model architecture later on.
-
Code no longer runs on John Chang's base code. Jordan has written new diagnoser code to accommodate a new code base.
- For the task of pneumonia detection, the new code base has far higher sensitivity/recall (~89%) and specificity (~88%), as seen in the new screenshot, than John Chang's code, which had sensitivity/recall of ~33% and specificity of ~67%.
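For reference, sensitivity/recall and specificity are derived from confusion-matrix counts as sketched below. The function name and the example counts are made up for illustration; they are not the counts behind the percentages quoted above.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity (recall) = TP / (TP + FN): share of pneumonia cases caught.
    specificity          = TN / (TN + FP): share of normal cases cleared.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative counts only.
sens, spec = sensitivity_specificity(tp=89, fn=11, tn=88, fp=12)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

A model with low sensitivity misses many true pneumonia cases, which is the more costly error for a screening tool, so the two figures are worth reporting separately rather than as a single accuracy number.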