Principal Component Regression (PCR) for multivariate calibration in chemical spectroscopy, written in Python with a Tkinter-based GUI.
This tool reads spectral CSV data, applies Savitzky–Golay preprocessing, performs PCR calibration, and outputs the predicted concentrations as a CSV file.
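The Savitzky–Golay step can be sketched with `scipy.signal.savgol_filter`. This is only an illustration on synthetic data: the window length, polynomial order, and the optional derivative are assumed values, not necessarily the settings PCR.py uses.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic absorbance matrix: 5 spectra (rows) x 200 spectral points (columns).
rng = np.random.default_rng(0)
wavelengths = np.linspace(400, 700, 200)
band = np.exp(-((wavelengths - 550) / 30) ** 2)
spectra = np.outer(np.linspace(0.2, 1.0, 5), band)
noisy = spectra + rng.normal(scale=0.01, size=spectra.shape)

# Smooth every spectrum along the wavelength axis.
# window_length and polyorder are illustrative assumptions.
smoothed = savgol_filter(noisy, window_length=11, polyorder=2, axis=1)

# Optional first derivative, a common way to suppress constant baseline offsets.
first_derivative = savgol_filter(noisy, window_length=11, polyorder=2, deriv=1, axis=1)

print(smoothed.shape, first_derivative.shape)
```

A wider window smooths more aggressively but can flatten narrow bands, so the window is usually chosen relative to the width of the spectral features.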
Principal Component Regression (PCR) combines Principal Component Analysis (PCA) with Ordinary Least Squares (OLS) regression. PCA reduces the dimensionality of the spectral data and, because the resulting component scores are uncorrelated, reduces multicollinearity, which helps the model separate the contributions of the independent variables. OLS then models the linear relationship between the component scores and the known concentrations. As a result, PCR provides a multivariate linear model that is more robust than the classical calibration method.
In this process, the spectral data are reduced to a small number of principal components; the optimal number is determined by cross-validation.
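A minimal sketch of PCR with cross-validated selection of the number of components, using a scikit-learn pipeline on synthetic two-component spectra. The data, the 5-fold split, the candidate range of 1 to 8 components, and the RMSECV criterion are all assumptions for illustration; PCR.py may organize this differently.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline

# Synthetic calibration set: an analyte band overlapping an interferent band.
rng = np.random.default_rng(0)
n_standards, n_points = 30, 120
axis = np.linspace(0.0, 1.0, n_points)
analyte_band = np.exp(-((axis - 0.40) / 0.08) ** 2)
interferent_band = np.exp(-((axis - 0.55) / 0.10) ** 2)
conc = rng.uniform(0.1, 1.0, n_standards)
interferent = rng.uniform(0.1, 1.0, n_standards)
X = np.outer(conc, analyte_band) + np.outer(interferent, interferent_band)
X += rng.normal(scale=0.005, size=X.shape)
y = conc

# Cross-validated RMSE (RMSECV) for each candidate number of components.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
rmsecv = {}
for k in range(1, 9):
    model = make_pipeline(PCA(n_components=k), LinearRegression())
    y_cv = cross_val_predict(model, X, y, cv=cv)
    rmsecv[k] = float(np.sqrt(np.mean((y - y_cv) ** 2)))

# Keep the number of components with the lowest RMSECV and refit on all standards.
best_k = min(rmsecv, key=rmsecv.get)
pcr = make_pipeline(PCA(n_components=best_k), LinearRegression()).fit(X, y)
print(best_k, rmsecv[best_k])
```

Placing PCA inside the pipeline means the components are recomputed on each training fold, so the held-out standards never influence the decomposition.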
To run the PCR.py file, make sure the following dependencies are installed:
- Pandas
- Numpy
- Matplotlib
- Scipy
- Scikit-learn
- Tkinter (part of the Python standard library; some Linux distributions package it separately)
- Customtkinter
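One way to check the environment before launching the GUI is to look up each module without importing it. The helper name `missing_dependencies` and the module-to-label mapping are hypothetical; note that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Import name -> label shown to the user (hypothetical mapping for this sketch).
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scipy": "scipy",
    "sklearn": "scikit-learn",
    "tkinter": "tkinter",
    "customtkinter": "customtkinter",
}


def missing_dependencies(required=REQUIRED):
    """Return the labels of modules that cannot be located on this interpreter."""
    return [label for module, label in required.items() if find_spec(module) is None]


missing = missing_dependencies()
print("Missing:", ", ".join(missing) if missing else "none")
```

This only confirms that a module can be found, not that it imports cleanly, so treat it as a quick first check.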
After running PCR.py, the GUI will prompt you to upload four .csv files:
- Spectral data (absorbance) of the standards (training)
- Wavelength or wavenumber values at which the spectral data were collected
- Known concentrations of the standards (training)
- Spectral data (absorbance) of the sample
Each file must follow a specific format:
You can test the application using:
- standards_spectral.csv
- wavelength.csv
- standard_concentrations.csv
- sample_simulation.csv
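Before calibration it can help to confirm that the four inputs agree in shape. The sketch below assumes a layout that this README does not confirm: header-less files, one spectrum per row, one spectral point per column, and single-column files for the axis values and concentrations. The helpers `read_matrix` and `check_inputs` are hypothetical, and small in-memory strings stand in for the real files.

```python
import io

import numpy as np
import pandas as pd


def read_matrix(source):
    """Read a header-less numeric CSV into a 2-D float array (assumed layout)."""
    return pd.read_csv(source, header=None).to_numpy(dtype=float)


def check_inputs(standards, wavelengths, concentrations, sample):
    """Return a list of shape problems; an empty list means the inputs agree."""
    problems = []
    n_standards, n_points = standards.shape
    if wavelengths.size != n_points:
        problems.append(f"{wavelengths.size} axis values for {n_points} spectral points")
    if concentrations.shape[0] != n_standards:
        problems.append(f"{concentrations.shape[0]} concentrations for {n_standards} standards")
    if sample.shape[1] != n_points:
        problems.append(f"sample has {sample.shape[1]} points, standards have {n_points}")
    return problems


# In-memory stand-ins for the four files: 3 standards, 4 spectral points.
standards = read_matrix(io.StringIO("0.1,0.2,0.3,0.2\n0.2,0.4,0.6,0.4\n0.3,0.6,0.9,0.6\n"))
wavelengths = read_matrix(io.StringIO("400\n410\n420\n430\n")).ravel()
concentrations = read_matrix(io.StringIO("1.0\n2.0\n3.0\n"))
sample = read_matrix(io.StringIO("0.15,0.3,0.45,0.3\n"))

problems = check_inputs(standards, wavelengths, concentrations, sample)
print(problems or "inputs are consistent")
```

With real data, each `io.StringIO(...)` argument would be replaced by the path to the corresponding file, provided the files follow the assumed layout.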
Output:
- Preprocessing and calibration plots will be shown.
- Model performance metrics will be shown.
- Final predicted concentrations are saved as "predicted_concentrations_pcr.csv".
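As an illustration of the last two outputs, the sketch below computes two common calibration metrics (RMSE and R²) and writes predictions to a file with the name given above. Which metrics PCR.py actually reports is not stated here, and the numbers and the column name `predicted_concentration` are made up for the example.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical known vs. fitted concentrations for four standards.
y_known = np.array([1.0, 2.0, 3.0, 4.0])
y_fitted = np.array([1.1, 1.9, 3.2, 3.9])

rmse = float(np.sqrt(mean_squared_error(y_known, y_fitted)))
r2 = float(r2_score(y_known, y_fitted))
print(f"RMSE = {rmse:.4f}, R^2 = {r2:.4f}")

# Write hypothetical sample predictions; a temporary folder avoids overwriting real output.
out_path = Path(tempfile.mkdtemp()) / "predicted_concentrations_pcr.csv"
pd.DataFrame({"predicted_concentration": [2.47, 3.05]}).to_csv(out_path, index=False)
```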