This project demonstrates how to load and process the Delaney Solubility Dataset using pandas. The dataset contains chemical descriptors and their corresponding solubility values, which can be used to predict the solubility of a compound.
The Delaney Solubility Dataset is available from the Data Professor's GitHub repository. It contains the following columns:
- logS: The aqueous solubility of the compound on a logarithmic scale (log10 of the solubility in mol/L).
- Various chemical descriptors that represent different properties of the compounds.
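A minimal sketch of the loading step, assuming the CSV lives at the URL below in the Data Professor's repository (the exact path is an assumption and should be checked before use). The helper names load_delaney and split_features_target are hypothetical, introduced here only for illustration. The sketch treats logS as the target and every other column as a descriptor.

```python
import pandas as pd

# Assumed location of the CSV in the Data Professor's GitHub repository;
# verify this URL before relying on it.
DELANEY_URL = (
    "https://raw.githubusercontent.com/dataprofessor/data/master/"
    "delaney_solubility_with_descriptors.csv"
)


def load_delaney(source=DELANEY_URL):
    """Read the dataset from a URL, a local path, or a file-like object."""
    return pd.read_csv(source)


def split_features_target(df, target="logS"):
    """Return (X, y): all descriptor columns, and the target column."""
    if target not in df.columns:
        raise KeyError(f"expected a '{target}' column, found {list(df.columns)}")
    y = df[target]
    X = df.drop(columns=[target])
    return X, y


# Usage (requires network access, so it is left commented out):
#   df = load_delaney()
#   X, y = split_features_target(df)
#   print(X.shape, y.shape)
```

Passing a local file path to load_delaney avoids downloading the file on every run.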
The following Python libraries are required to run the script:
- pandas: For data manipulation and analysis.
- scikit-learn (imported as sklearn): For data splitting and prediction.
- matplotlib: For plotting the prediction graph.
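To show how the three libraries fit together, here is a rough sketch of the split, fit, and plot steps. The choice of LinearRegression, the 80/20 split, and the function name fit_and_plot are assumptions made for illustration; the project's actual script may use a different model or settings.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def fit_and_plot(X, y, test_size=0.2, random_state=42):
    """Split the data, fit a linear model, and plot predicted vs. actual values."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )

    # Assumption: a plain linear regression stands in for whatever model is used.
    model = LinearRegression()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Scatter plot of held-out experimental values against predictions.
    fig, ax = plt.subplots(figsize=(5, 5))
    ax.scatter(y_test, y_pred, alpha=0.5)
    ax.set_xlabel("Experimental logS")
    ax.set_ylabel("Predicted logS")
    return model, fig


# Usage, given a descriptor table X and a logS column y:
#   model, fig = fit_and_plot(X, y)
#   fig.savefig("prediction.png")
```

Fixing random_state keeps the train/test split reproducible between runs.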
You can install the required dependencies using pip:
pip install pandas
pip install scikit-learn
pip install matplotlib
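After installing, a quick check like the following can confirm that the packages are visible to the current Python environment. It is only a sketch; the helper name installed_versions is hypothetical. Note that the package is installed under the name scikit-learn but imported as sklearn.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI.
REQUIRED = ["pandas", "scikit-learn", "matplotlib"]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```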