This repository contains the source code for the DeSocial project. DeSocial is a decentralized social network framework that utilizes blockchain infrastructure and graph learning models to enable transparent, user-driven, and personalized social network predictions. The project website link is here.
DeSocial/
│
├── blockchain/ # Blockchain simulation environment and client scripts
│ ├── blockchain.py # Blockchain infrastructure
│ └── user.py # Actions of the users (request, vote, train models, aggregation, etc.)
│
├── contract/ # Solidity smart contracts for validator voting, user actions, etc.
│ ├── contracts/contract.sol # The Solidity smart contract.
│ └── other files: # The compilation environment.
│
├── data/ # Processed data.
│ └── $DATASET/ # The dataset name
│ ├── edge_list.csv # The graph edge data.
│ └── node_feat.npy # The input node features.
│
├── model/ # Graph learning models (e.g., GraphSAGE, GCN, GAT) and the personalized algorithm selection module
│ ├── dispatcher.py # Model dispatcher, returning an instance of model given its name.
│ ├── models.py # Graph algorithm classes.
│ └── select.py # The personalized algorithm selection module.
│
├── utils/ # Utility functions and helpers
│ ├── DataLoader.py # Data loader.
│ ├── EarlyStopping.py # Early stopping of the graph training.
│ ├── configs.py # DeSocial config settings.
│ ├── metrics.py # Evaluation metric calculation.
│ └── utils.py # Miscellaneous functions (negative sampling, validator group initialization, etc.)
│
├── eval.py # Evaluation functions.
│
├── run.py # Main entry to run the pipeline (including the decentralized multi-validator consensus module).
│
├── ganache_install.md # Ganache installation instructions.
│
└── requirements.txt # DeSocial execution environment dependencies.
The framework of DeSocial (with both modules enabled) works as follows.
Step 1: User
Step 2: The blockchain collects all user requests, constructs
Step 3: Each validator selected in this period independently trains their own graph learning model based on the data
Step 4: User
Step 5: Validator nodes retrieve
Step 6: One selected validator in each community executes the sampling task using algorithm
Step 7: The result of each algorithm trial is returned to
Step 8: User
Step 9: Validators in
Step 10: The period ends, all the nodes in the network copy the social network data
For details on these notations, please refer to the problem definitions in our paper.
For each dataset we use, we provide the URL of its edge_list.csv file.
- UCI (uci.zip, ml_uci.csv) [Towards Better Evaluation for Dynamic Link Prediction, NeurIPS 2022]
- Memo-Tx is processed by ourselves. Please refer to data/Memo-Tx.
- Enron (edge_list.csv) [DTGB, NeurIPS 2024]
- GDELT (edge_list.csv) [DTGB, NeurIPS 2024]
The node features are generated with np.random, and we recommend using our provided .npy files.
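For readers who want to see what randomly generated node features might look like, here is a minimal sketch. The feature dimension, the standard normal distribution, the seed, and the helper name `make_node_features` are all assumptions made for illustration and are not taken from the DeSocial code. To reproduce reported results, use the provided .npy files instead.

```python
import os
import tempfile

import numpy as np


def make_node_features(num_nodes, feat_dim, seed=0):
    """Return a (num_nodes, feat_dim) float32 matrix of random features.

    Hypothetical helper: the distribution and dtype are assumptions,
    not the settings used to build the provided node_feat.npy files.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((num_nodes, feat_dim)).astype(np.float32)


# Write the features to a temporary directory so nothing in data/ is overwritten.
features = make_node_features(num_nodes=100, feat_dim=16)
out_path = os.path.join(tempfile.mkdtemp(), "node_feat.npy")
np.save(out_path, features)

# Load them back the same way a data loader could read a .npy file.
loaded = np.load(out_path)
print(loaded.shape, loaded.dtype)
```

Fixing the seed makes the sketch deterministic, but it will not match the features shipped with the datasets.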
All datasets used in DeSocial are available here.
The graph training algorithms are implemented based on the open-source DTGB benchmark.
After downloading the repository, please install all dependencies by running
python -m venv DeSocial
source DeSocial/bin/activate
pip install -r requirements.txt
If you want to deactivate the environment, simply run
deactivate
To set up the ETH Ganache environment, please follow ganache_install.md.
To create the DeSocial smart contract JSON files and import them into Python, we use Truffle to compile the smart contracts written in Solidity.
To install Truffle, you can use npm.
npm install -g truffle
Run the following commands to generate the smart contract JSON:
cd contract
truffle compile
After compilation, you will find the smart contract JSON at ./contract/build/contracts/DeSocial.json.
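As a rough illustration of importing the compiled contract into Python, the sketch below reads a Truffle build artifact and extracts its `abi` and `bytecode` fields, which Truffle writes into each artifact JSON. The helper name `load_truffle_artifact` is hypothetical, and how DeSocial itself loads the file may differ.

```python
import json
from pathlib import Path


def load_truffle_artifact(path):
    """Return (abi, bytecode) from a Truffle build artifact JSON file.

    Hypothetical helper for illustration; not part of the DeSocial code.
    """
    with open(path, "r", encoding="utf-8") as f:
        artifact = json.load(f)
    return artifact["abi"], artifact["bytecode"]


# Only runs if the contract has already been compiled with `truffle compile`.
artifact_path = Path("contract/build/contracts/DeSocial.json")
if artifact_path.exists():
    abi, bytecode = load_truffle_artifact(artifact_path)
    print(f"Loaded {len(abi)} ABI entries from {artifact_path}")
```

The ABI and bytecode are what an Ethereum client library typically needs to deploy a contract or call its functions.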
To run DeSocial, please run
python run.py
To quickly reproduce the results of DeSocial with the best configuration, please run
python run.py --cuda $CUDA --dataset_name $DATASET --f_pool $F --experts $EXPERTS --metric $METRIC --start_period 28 --load_best_configs
The ranges of some important arguments are specified below:
$F in [MLP, GCN, GAT, SAGE, SGC, PA] (PA for enabling personalized algorithm selection.)
$DATASET in [UCI, Memo-Tx, Enron, GDELT]
$METRIC in [Acc@2, Acc@3, Acc@5]
For example, to quickly reproduce DeSocial-X (with a validator community size of 5), where X is one of the backbones, say SGC on UCI, please run
python run.py --cuda 0 --dataset_name UCI --f_pool SGC --experts 5 --start_period 28 --metric Acc@2 --load_best_configs
If you want to reproduce DeSocial-PA on UCI, please run
python run.py --cuda 0 --dataset_name UCI --f_pool PA --experts 1 --start_period 28 --metric Acc@2 --load_best_configs
If you want to reproduce DeSocial-Full on UCI, please run
python run.py --cuda 0 --dataset_name UCI --f_pool PA --experts 5 --start_period 28 --metric Acc@2 --load_best_configs
If you want to reproduce DeSocial on UCI with a given backbone selection pool, e.g., {GraphSAGE, SGC}, please run
python run.py --cuda 0 --dataset_name UCI --f_pool SAGE+SGC --experts 5 --start_period 28 --metric Acc@2 --load_best_configs
Use "+" to combine the backbone names.
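The "+" convention can be illustrated with a small parsing sketch. The function `parse_f_pool` and the constant `VALID_NAMES` are hypothetical names introduced here. The accepted values are taken from the argument list above, and the actual parsing inside run.py may differ.

```python
from typing import List

# Values accepted by --f_pool according to the argument list in this README.
VALID_NAMES = {"MLP", "GCN", "GAT", "SAGE", "SGC", "PA"}


def parse_f_pool(f_pool: str) -> List[str]:
    """Split a "+"-joined --f_pool value into individual names.

    Hypothetical helper: raises ValueError on empty or unknown names.
    """
    names = [name.strip() for name in f_pool.split("+") if name.strip()]
    if not names:
        raise ValueError("f_pool must contain at least one name")
    unknown = [name for name in names if name not in VALID_NAMES]
    if unknown:
        raise ValueError(f"unknown backbone name(s): {unknown}")
    return names


print(parse_f_pool("SAGE+SGC"))
print(parse_f_pool("SGC"))
```

Validating the names early gives a clear error for a typo such as `SAGE+SCG` before any training starts.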
If you want to disable the decentralized multi-validator consensus, please run
python run.py --cuda 0 --dataset_name UCI --f_pool PA --experts 1 --start_period 28 --metric Acc@2 --load_best_configs
If you just want to try a single backbone, please run
python run.py --cuda 0 --dataset_name UCI --f_pool SGC --experts 1 --start_period 28 --metric Acc@2 --load_best_configs
Unlike centralized algorithms, DeSocial has the blockchain select a different set of validators in each period. Because there are tens of thousands of nodes, every selected validator is likely to be trained from randomly initialized ML parameters. Therefore, the results can be reproduced by starting directly from the first testing period. Since the first testing period is t+2=30, the start period is set to 28.
We report the run time measured while observing a single evaluation metric, because voting and aggregation run serially rather than in parallel, which makes their overhead high.
If the program terminates because of a blockchain issue, wait a few minutes and rerun it.
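The wait-and-rerun advice can be automated with a small wrapper, sketched below. The helper name `run_with_retry`, the number of attempts, and the wait time are assumptions chosen for illustration. The wrapper retries on any non-zero exit code, since it cannot tell a blockchain issue apart from other failures.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, max_attempts=3, wait_seconds=180.0):
    """Run cmd until it exits with code 0, waiting between attempts.

    Hypothetical helper. Returns the attempt number that succeeded and
    raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Give the local blockchain time to recover before rerunning.
            time.sleep(wait_seconds)
    raise RuntimeError(f"command failed after {max_attempts} attempts: {cmd}")


# Example call (not executed here), reusing a command from this README:
# run_with_retry([sys.executable, "run.py", "--cuda", "0", "--dataset_name", "UCI",
#                 "--f_pool", "SGC", "--experts", "5", "--start_period", "28",
#                 "--metric", "Acc@2", "--load_best_configs"])
```

Because every non-zero exit triggers a retry, check the program output when all attempts fail to see whether the cause is unrelated to the blockchain.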