This project analyzes a dataset of diagnostic measurements taken from 768 Pima Indian women and predicts the presence of diabetes using classification models built in R.
"Diabetes Classification Final Report.pdf" is the final report. It summarizes the entire project and describes its results.
"diabetes.csv" is the dataset used in this project.
Several smaller reports were written at various stages of the project. "Data Quality Report.pdf" summarizes the data, visualizes it, and assesses its quality. "Data Quality Plan.pdf" describes the decisions made during preprocessing. "Classifier Report-1.pdf" and "Classifier Report-2.pdf" describe the results of applying various classifiers to the preprocessed data.
"Data Quality Report.R", "Preprocessing.R", "classifiers1.R", and "classifiers2.R" are the R scripts used to analyze the data, preprocess it, and build and assess the classifiers.
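
For readers who want a feel for the workflow before opening the scripts, here is a minimal sketch of fitting and scoring one classifier in base R. It is not taken from the project's scripts: the helper name `evaluate_logistic`, the choice of logistic regression, the 70/30 split, and the 0.5 threshold are all assumptions made for illustration.

```r
# Hypothetical helper (not from the project's scripts): fit a logistic
# regression on a random training split and score it on the held-out rows.
# `outcome` is the name of a 0/1 column; every other column is a predictor.
evaluate_logistic <- function(data, outcome, train_frac = 0.7, seed = 1) {
  set.seed(seed)
  n <- nrow(data)
  train_idx <- sample(n, size = floor(train_frac * n))
  train <- data[train_idx, , drop = FALSE]
  test <- data[-train_idx, , drop = FALSE]

  # Build the formula "<outcome> ~ ." and fit with a binomial family.
  model <- glm(as.formula(paste(outcome, "~ .")), data = train, family = binomial)

  # Predicted probabilities on the held-out rows, thresholded at 0.5.
  prob <- predict(model, newdata = test, type = "response")
  pred <- as.integer(prob > 0.5)

  list(
    model = model,
    confusion = table(predicted = pred, actual = test[[outcome]]),
    accuracy = mean(pred == test[[outcome]])
  )
}

# Usage, assuming the label column is named "Outcome" (not confirmed here):
# diabetes <- read.csv("diabetes.csv")
# result <- evaluate_logistic(diabetes, outcome = "Outcome")
# result$confusion
# result$accuracy
```

The column name in the usage lines should be checked against the header of "diabetes.csv" before running. The classifiers the project actually used are documented in the classifier reports and scripts.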