How to Use GIMLET for Graph Data Represented by an Adjacency Matrix? #7

@FrankLiu1102

I want to use GIMLET to process graphs represented by an adjacency matrix instead of molecular structures based on SMILES. For example, my dataset in CSV format looks like this:

, v1, v2, v3, v4, v5, v6
v1, 0, 1, 1, 1, 1, 0
v2, 1, 0, 1, 0, 1, 1
v3, 1, 1, 0, 1, 0, 0
v4, 1, 0, 1, 0, 1, 0
v5, 1, 1, 0, 1, 0, 0
v6, 0, 1, 0, 0, 0, 0

The first row and first column represent node names.
0/1 indicates whether an edge exists (unweighted graph).
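
For reference, here is a minimal sketch of how a matrix in this layout could be turned into a COO-style edge list, which is the `[2, num_edges]` shape that `torch_geometric.data.Data` expects for `edge_index`. It uses only the standard library. The function name `adjacency_csv_to_edge_index` and the inline `CSV_TEXT` are illustrative assumptions, not part of GIMLET, and this does not show what GIMLET's own pipeline requires.

```python
import csv
import io

# The example matrix from above, inlined so the sketch is self-contained.
CSV_TEXT = """\
, v1, v2, v3, v4, v5, v6
v1, 0, 1, 1, 1, 1, 0
v2, 1, 0, 1, 0, 1, 1
v3, 1, 1, 0, 1, 0, 0
v4, 1, 0, 1, 0, 1, 0
v5, 1, 1, 0, 1, 0, 0
v6, 0, 1, 0, 0, 0, 0
"""


def adjacency_csv_to_edge_index(text):
    """Hypothetical helper: parse a labelled 0/1 adjacency matrix.

    Returns (node_names, [sources, targets]), where the two inner lists
    hold the row and column index of every non-zero entry.
    """
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(text))
        if row  # skip blank lines
    ]
    names = rows[0][1:]  # header row, minus the empty corner cell
    index = {name: i for i, name in enumerate(names)}

    sources, targets = [], []
    for row in rows[1:]:
        src = index[row[0]]  # first column holds the node name
        for dst, cell in enumerate(row[1:]):
            if int(cell) != 0:
                sources.append(src)
                targets.append(dst)
    return names, [sources, targets]


names, edge_index = adjacency_csv_to_edge_index(CSV_TEXT)
print(names)
print(len(edge_index[0]), "directed edges")
```

Because the matrix is symmetric, each undirected edge appears twice (once per direction). Wrapping `edge_index` in `torch.tensor(edge_index, dtype=torch.long)` would give a tensor that can be passed to `Data(edge_index=...)`. Node and edge features are not covered here, since the CSV carries none.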

I modified dataloaders/molecule_datasets.py to include my_dataset, allowing it to read CSV files.

I updated the process() method to avoid parsing SMILES and instead construct a torch_geometric.data.Data object directly from the adjacency matrix.

However, when running python downstream_test.py --dataset my_dataset ..., I encountered errors such as invalid smiles and IndexError: list index out of range.

Does GIMLET support processing general graph-structured data (e.g., adjacency matrices)?

If so, what is the recommended approach (e.g., a proper process() method) to convert an adjacency matrix into a GIMLET-compatible graph format?

If the current version of GIMLET is mainly designed for molecular data, are there any plans to extend support for more general graph structures?

I really appreciate your time and help! Looking forward to your response. 🙏
