Description
I want to use GIMLET to process graphs represented by an adjacency matrix instead of molecular structures based on SMILES. For example, my dataset in CSV format looks like this:
, v1, v2, v3, v4, v5, v6
v1, 0, 1, 1, 1, 1, 0
v2, 1, 0, 1, 0, 1, 1
v3, 1, 1, 0, 1, 0, 0
v4, 1, 0, 1, 0, 1, 0
v5, 1, 1, 0, 1, 0, 0
v6, 0, 1, 0, 0, 0, 0
The first row and first column represent node names.
0/1 indicates whether an edge exists (unweighted graph).
I modified dataloaders/molecule_datasets.py to include my_dataset, allowing it to read CSV files.
I updated the process() method to avoid parsing SMILES and instead construct a torch_geometric.data.Data object directly from the adjacency matrix.
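For illustration, the conversion described above can be sketched roughly as follows. This is a minimal sketch and not the actual modified process() method: the function name adjacency_csv_to_edge_index and the SAMPLE_CSV constant are hypothetical, and it only uses the Python standard library to build the edge list in the [sources, targets] layout that torch_geometric uses for edge_index. The node features and any other fields that GIMLET's molecule pipeline expects are not known here, so the final conversion step is shown only as a comment with placeholder features.

```python
import csv
import io

# The adjacency matrix from the example above (header row and first column are node names).
SAMPLE_CSV = """\
, v1, v2, v3, v4, v5, v6
v1, 0, 1, 1, 1, 1, 0
v2, 1, 0, 1, 0, 1, 1
v3, 1, 1, 0, 1, 0, 0
v4, 1, 0, 1, 0, 1, 0
v5, 1, 1, 0, 1, 0, 0
v6, 0, 1, 0, 0, 0, 0
"""


def adjacency_csv_to_edge_index(csv_text):
    """Parse an adjacency-matrix CSV into (node_names, edge_index).

    edge_index is [sources, targets]: one entry per nonzero cell, so an
    undirected edge appears twice (once in each direction).
    """
    rows = [[cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(csv_text)) if row]
    node_names = rows[0][1:]          # skip the empty top-left cell
    body = rows[1:]
    if len(body) != len(node_names):
        raise ValueError("adjacency matrix must be square")

    sources, targets = [], []
    for i, row in enumerate(body):
        if row[0] != node_names[i]:
            raise ValueError(f"row label {row[0]!r} does not match column {node_names[i]!r}")
        if len(row) - 1 != len(node_names):
            raise ValueError(f"row {row[0]!r} has the wrong number of columns")
        for j, cell in enumerate(row[1:]):
            if int(cell) != 0:        # unweighted graph: any nonzero value is an edge
                sources.append(i)
                targets.append(j)
    return node_names, [sources, targets]


node_names, edge_index = adjacency_csv_to_edge_index(SAMPLE_CSV)

# Assumed follow-up step (requires torch and torch_geometric; node features are a placeholder):
#   import torch
#   from torch_geometric.data import Data
#   x = torch.zeros((len(node_names), 1), dtype=torch.long)
#   data = Data(x=x, edge_index=torch.tensor(edge_index, dtype=torch.long))
```

A graph built this way has no atom or bond features, which may be part of why the downstream molecule code fails.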
However, when running python downstream_test.py --dataset my_dataset ..., I encountered errors such as invalid smiles and IndexError: list index out of range.
Does GIMLET support processing general graph-structured data (e.g., adjacency matrices)?
If so, what is the recommended approach (e.g., a proper process() method) to convert an adjacency matrix into a GIMLET-compatible graph format?
If the current version of GIMLET is mainly designed for molecular data, are there any plans to extend support for more general graph structures?
I really appreciate your time and help! Looking forward to your response. 🙏