🚀 LLM as GNN: Graph Vocabulary Learning for Text-attributed Graph Foundation Model

Note: this repository is still under active maintenance. We will release a more polished, formal version as soon as possible.

🔎 Overview

This repository contains the code implementation for the paper LLM as GNN: Graph Vocabulary Learning for Text-attributed Graph Foundation Model.

Abstract

Graphs typically exhibit distinctive structure and domain-specific knowledge, motivating the development of a Graph Foundation Model (GFM) capable of generalizing across various graphs and tasks. While recent efforts have focused on combining the strengths of Large Language Models (LLMs) and Graph Neural Networks (GNNs), they often struggle to maximize mutual benefit due to the decoupled architectures. Moreover, existing methods assign out-of-vocabulary (OOV) tokens to nodes, which are incompatible with the natural language vocabulary for task-oriented prompt generation, hindering knowledge transfer in GFM. In this paper, we introduce PromptGFM, a versatile GFM grounded in graph vocabulary learning, comprising two key components: (1) Graph Understanding Module, which explicitly replicates the finest GNN workflow in the language space using LLMs, enabling seamless GNN-LLM integration and elegant graph-text alignment; (2) Graph Inference Module, where we establish a novel language-based graph vocabulary to ensure expressiveness, transferability, and scalability. This vocabulary enables the generation of readable instructions for LLM inference, resolving modality incompatibility and facilitating positive transfer. Extensive experiments demonstrate the superiority of PromptGFM in node classification and link prediction, along with its strong transferability across different datasets and tasks.
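To make the idea of "replicating the GNN workflow in the language space" more concrete, here is a minimal sketch of prompt-based neighbor aggregation. It is an illustration only, not the code in this repository: the function names (`build_aggregation_prompt`, `propagate`, `stub_llm`), the prompt wording, and the data layout are all assumptions, and `stub_llm` is a deterministic stand-in for a real LLM call.

```python
from typing import Callable, Dict, List


def build_aggregation_prompt(node_text: str, neighbor_texts: List[str]) -> str:
    """Compose a prompt asking an LLM to refine a node's text using its neighbors.

    The wording is hypothetical; the actual prompts in the repository may differ.
    """
    lines = ["Central node: " + node_text, "Neighbors:"]
    lines += ["- " + text for text in neighbor_texts]
    lines.append(
        "Rewrite the central node description so that it also reflects its neighbors."
    )
    return "\n".join(lines)


def propagate(
    texts: Dict[str, str],
    adjacency: Dict[str, List[str]],
    llm: Callable[[str], str],
    num_layers: int = 1,
) -> Dict[str, str]:
    """Run `num_layers` rounds of text-level message passing.

    Each round plays the role of one GNN layer: every node's text is rewritten
    from its own text plus the texts of its neighbors from the previous round.
    """
    current = dict(texts)  # copy so the caller's dictionary is not modified
    for _ in range(num_layers):
        updated = {}
        for node, text in current.items():
            neighbor_texts = [current[n] for n in adjacency.get(node, [])]
            updated[node] = llm(build_aggregation_prompt(text, neighbor_texts))
        current = updated
    return current


def stub_llm(prompt: str) -> str:
    """Deterministic placeholder for an LLM: keeps the central text and
    appends how many neighbor lines the prompt contained."""
    lines = prompt.splitlines()
    central = lines[0][len("Central node: "):]
    num_neighbors = sum(1 for line in lines if line.startswith("- "))
    return f"{central} [+{num_neighbors} neighbors]"
```

In practice `stub_llm` would be replaced by a call to an actual language model, and the output of the final round would serve as the node's language-based representation.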

Environment 💾

Set up an environment

In your terminal, navigate to the directory containing the environment.yaml file, then run the following command to create the environment from it:

conda env create -f environment.yaml
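The name of the created environment comes from the `name:` field of environment.yaml, which this README does not state. The helper below is a small sketch for looking it up; `conda_env_name` is a hypothetical name, and it assumes the file uses the standard top-level `name:` key rather than parsing the full YAML.

```python
from typing import Optional


def conda_env_name(yaml_text: str) -> Optional[str]:
    """Return the value of the top-level `name:` key of a conda environment file.

    Only un-indented `name:` lines are considered; a trailing ` # comment`
    and surrounding quotes are dropped. Returns None if no name is found.
    """
    for line in yaml_text.splitlines():
        if line.startswith("name:"):
            value = line[len("name:"):].split(" #", 1)[0].strip().strip("'\"")
            return value or None
    return None
```

For example, `conda_env_name(open("environment.yaml").read())` gives the name to pass to `conda activate` before running the steps below.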

Run 🧑‍💻

Step 1: Graph Understanding Module

python ./generator/generate_textual_id_V3.py --dataset citeseer

Step 2: Readable Instruction Construction in Graph Inference Module

For link prediction:

python ./data_prepocess/LP_prepocess.py --dataset citeseer

For node classification:

python ./data_prepocess/NC_prepocess.py --dataset citeseer

Step 3: Multi-Prompt Instruction Fine-tuning in Graph Inference Module

Train the Model

python train.py --task link_prediction --dataset citeseer
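The three steps can also be chained from a single script. The sketch below assembles the link-prediction commands exactly as they appear in this README and runs them in order; `pipeline_commands` and `run_pipeline` are hypothetical helper names, and the script paths assume you run it from the repository root. Only `--task link_prediction` is shown in this README, so no other task value is assumed here.

```python
import subprocess
import sys
from typing import List


def pipeline_commands(dataset: str) -> List[List[str]]:
    """Build the Step 1-3 commands for link prediction on `dataset`.

    Script paths and flags are copied from the README; adjust them if the
    repository layout differs.
    """
    python = sys.executable  # use the interpreter of the active environment
    return [
        # Step 1: Graph Understanding Module
        [python, "./generator/generate_textual_id_V3.py", "--dataset", dataset],
        # Step 2: readable instruction construction (link prediction)
        [python, "./data_prepocess/LP_prepocess.py", "--dataset", dataset],
        # Step 3: multi-prompt instruction fine-tuning
        [python, "train.py", "--task", "link_prediction", "--dataset", dataset],
    ]


def run_pipeline(dataset: str, dry_run: bool = True) -> None:
    """Print each command and, unless `dry_run` is set, execute it.

    `check=True` stops the pipeline at the first step that fails.
    """
    for command in pipeline_commands(dataset):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Calling `run_pipeline("citeseer")` only prints the commands; pass `dry_run=False` to execute them.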

Repository: agiresearch/PromptGFM
