Final project for ST541 Probability, Computing, and Simulation in Statistics (PCS) at OSU
This project works through the E-step and the M-step of the EM algorithm for the Gaussian mixture model, applies the algorithm to a simulated dataset drawn from a two-component Gaussian mixture distribution, and evaluates its performance.
Lin_ST541_Project_FULL contains the full report.
Lin_ST541_Project is an abridged version of the full report.
00_simulated_data contains the code that simulates the data
01_initialization_kmeans contains the code that runs K-means clustering
02_EM_step contains drafts of the functions and can be skipped. Descriptions, code, and usage of the functions are documented under e_step and m_step in the R folder
03_EM_iteration contains the code that iterates between the E-step and the M-step and checks for convergence
04_Reporting contains the code that produces the (abridged) report
05_Presentation contains the code that produces the slides for the presentation
06_Appendix contains a rough sketch of the implementation in words
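
For orientation, the pipeline described above (simulate, initialize with K-means, alternate E-step and M-step until convergence) can be sketched in a few lines of base R. This is a minimal illustration for the univariate two-component case, not the code in this repository: the function names e_step_sketch and m_step_sketch, the true parameter values, the seed, and the convergence tolerance are all assumptions chosen for the example, and the actual e_step and m_step in the R folder may differ in interface and detail.

```r
# Minimal sketch (assumed setup): EM for a univariate two-component Gaussian mixture.
set.seed(541)

# Simulate n points: component 1 ~ N(0, 1) with weight 0.4,
# component 2 ~ N(4, 1.5^2) with weight 0.6 (illustrative values only).
n <- 500
z <- rbinom(n, size = 1, prob = 0.6)  # 1 means the point comes from component 2
x <- ifelse(z == 1, rnorm(n, 4, 1.5), rnorm(n, 0, 1))

# Initialize mixing weights, means, and standard deviations from K-means.
km <- kmeans(x, centers = 2, nstart = 10)
par <- list(
  pi    = as.numeric(table(km$cluster)) / n,
  mu    = as.numeric(km$centers),
  sigma = as.numeric(tapply(x, km$cluster, sd))
)

# E-step: responsibilities (posterior component probabilities) and the
# observed-data log-likelihood under the current parameters.
e_step_sketch <- function(x, par) {
  dens <- cbind(par$pi[1] * dnorm(x, par$mu[1], par$sigma[1]),
                par$pi[2] * dnorm(x, par$mu[2], par$sigma[2]))
  list(resp = dens / rowSums(dens), loglik = sum(log(rowSums(dens))))
}

# M-step: responsibility-weighted updates of weights, means, and sds.
m_step_sketch <- function(x, resp) {
  nk    <- colSums(resp)
  mu    <- colSums(resp * x) / nk
  sigma <- sqrt(colSums(resp * outer(x, mu, "-")^2) / nk)
  list(pi = nk / length(x), mu = mu, sigma = sigma)
}

# Iterate until the log-likelihood changes by less than the tolerance.
tol <- 1e-8
logliks <- numeric(0)
for (iter in 1:500) {
  e <- e_step_sketch(x, par)
  logliks <- c(logliks, e$loglik)
  par <- m_step_sketch(x, e$resp)
  if (iter > 1 && abs(logliks[iter] - logliks[iter - 1]) < tol) break
}

print(par)
```

A useful sanity check on any EM implementation is that the log-likelihood stored in logliks never decreases from one iteration to the next; plotting it against the iteration number makes convergence easy to see.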