This repository contains my submission for the Kaggle Leash-BELKA competition. The objective of the competition is to predict molecular properties, specifically whether small molecules bind to given protein targets, from the provided dataset.
- molecule-properties-model.ipynb: Jupyter notebook containing the data analysis, feature extraction, and model training processes.
- Data Processing: Utilizes libraries such as Pandas and RDKit for handling and analyzing molecular data.
- Feature Engineering: Generates molecular descriptors to use as features in the model.
- Modeling: Implements the CatBoost classifier for predicting molecular properties based on the generated features.
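
As a rough illustration of the data processing and feature engineering steps above, here is a minimal sketch that turns SMILES strings into a table of RDKit descriptors. The column name `smiles`, the helper `featurize`, the `DESCRIPTORS` selection, and the toy molecules are all assumptions made for this example; the notebook may use a different column name and a different descriptor set.

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Descriptors

# Hypothetical descriptor selection; the notebook may compute others.
DESCRIPTORS = {
    "MolWt": Descriptors.MolWt,
    "LogP": Descriptors.MolLogP,
    "TPSA": Descriptors.TPSA,
    "HDonors": Descriptors.NumHDonors,
    "HAcceptors": Descriptors.NumHAcceptors,
}


def featurize(smiles: str) -> dict:
    """Return one descriptor value per entry in DESCRIPTORS (NaN if unparsable)."""
    mol = Chem.MolFromSmiles(smiles)  # returns None for invalid SMILES
    if mol is None:
        return {name: float("nan") for name in DESCRIPTORS}
    return {name: fn(mol) for name, fn in DESCRIPTORS.items()}


# Toy input: ethanol, benzene, acetic acid, and one invalid string.
df = pd.DataFrame({"smiles": ["CCO", "c1ccccc1", "CC(=O)O", "not_a_smiles"]})

# One row of descriptors per molecule, aligned with df's index.
features = pd.DataFrame([featurize(s) for s in df["smiles"]], index=df.index)
print(features)
```

Invalid SMILES become rows of NaN rather than raising, so they can be dropped or imputed before training.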
The notebook is structured to demonstrate the process of loading data, preprocessing, and training the model. Please refer to the notebook for detailed outputs and evaluations.
1. Clone this repository.
2. Install the required dependencies using:
    pip install schrodinger openbabel rdkit joblib
3. Open the Jupyter notebook and run the analyses:
    jupyter notebook molecule-properties-model.ipynb