In this project, the possible identities and relative composition of the chemical compounds in an unknown ginger oil sample were studied by gas chromatography-mass spectrometry (GC-MS).
Data on the observed GC peaks corresponding to the separated components, the distribution of mass fragments in MS, and the instrument-matched candidate structures (the MS hits for each GC peak) were collected with a GC-MS instrument (Shimadzu GCMS-QP2010) and processed into .TXT and .CSV files.
A Python program using the pandas and regular expression (re) modules was developed to summarize this information, through textual analysis, into sorted tables of the most significant peaks, the names of candidate structures, and the relative composition. Using the program’s output CSV files, the most likely candidate structures (e.g. beta-curcumene, zingiberene) for the chief components of ginger oil were determined manually based on data from the scientific literature.
Although the GC-MS method could not fully narrow down the identities of the ginger oil components, the program’s results suggest that textual analysis methods may be used to process and summarize GC-MS data when studying the many components and the composition of plant oils.
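The kind of textual analysis described above can be illustrated with a minimal sketch. It is not the project's actual program: the report layout, the regular expressions, the function names (`parse_report`, `summarize`), and all values in the sample text are assumptions made for illustration, since the instrument's export format is not reproduced here. The sketch parses peak and hit lines with `re`, computes each peak's share of the total area with pandas, and keeps the highest-similarity hit for the largest peaks.

```python
import re
import pandas as pd

# Hypothetical report text: the real instrument export layout is not given,
# so this format and every value in it are made up for illustration only.
REPORT = """\
Peak 1  RT 10.52  Area 120000
  Hit 1  SI 92  Name: compound A
  Hit 2  SI 88  Name: compound B
Peak 2  RT 12.10  Area 360000
  Hit 1  SI 95  Name: compound C
Peak 3  RT 14.75  Area 20000
  Hit 1  SI 81  Name: compound D
"""

# Assumed line patterns for the made-up layout above.
PEAK_RE = re.compile(r"^Peak\s+(\d+)\s+RT\s+([\d.]+)\s+Area\s+(\d+)")
HIT_RE = re.compile(r"^\s+Hit\s+(\d+)\s+SI\s+(\d+)\s+Name:\s*(.+?)\s*$")


def parse_report(text):
    """Split the report into a peak table and a hit table."""
    peaks, hits = [], []
    current_peak = None
    for line in text.splitlines():
        m = PEAK_RE.match(line)
        if m:
            current_peak = int(m.group(1))
            peaks.append({"peak": current_peak,
                          "rt_min": float(m.group(2)),
                          "area": int(m.group(3))})
            continue
        m = HIT_RE.match(line)
        if m and current_peak is not None:
            # Each hit line belongs to the most recent peak line.
            hits.append({"peak": current_peak,
                         "hit": int(m.group(1)),
                         "similarity": int(m.group(2)),
                         "name": m.group(3)})
    return pd.DataFrame(peaks), pd.DataFrame(hits)


def summarize(peaks, hits, top_n=2):
    """Return the top_n peaks by relative area with their best-scoring hit."""
    peaks = peaks.copy()
    # Relative composition as percent of total integrated peak area.
    peaks["area_pct"] = 100 * peaks["area"] / peaks["area"].sum()
    top = peaks.sort_values("area_pct", ascending=False).head(top_n)
    # Keep only the highest-similarity hit for each peak.
    best = (hits.sort_values(["peak", "similarity"], ascending=[True, False])
                .drop_duplicates("peak"))
    return (top.merge(best[["peak", "name", "similarity"]], on="peak", how="left")
               .reset_index(drop=True))


peaks_df, hits_df = parse_report(REPORT)
summary = summarize(peaks_df, hits_df)
print(summary)
```

A table like `summary` could then be written out with `summary.to_csv("summary.csv", index=False)` for manual comparison against literature data. Approximating relative composition by percent of total peak area is itself a simplifying assumption of this sketch.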