SmartMap: Architecture-Agnostic CGRA Mapping using Graph Traversal and Reinforcement Learning

SmartMap is a framework that applies an actor-critic reinforcement learning method to Monte-Carlo Tree Search (MCTS) to learn how to map a data-flow graph (DFG) onto a coarse-grained reconfigurable array (CGRA). The framework is fully portable because its policy network uses a state-action representation layer instead of a probability distribution over actions. It also integrates a graph traversal placement method, which improves scalability and makes MCTS more efficient by enabling more effective exploration during the search.

Summary

Paper
Next Steps
Project Structure
Setup
Hyperparameters
- Pre-training
- Mapping
Results
- Pre-training
  - Visualise All Pre-Training Results
- Mapping
  - Example of an Interpretable Mapping
Workflow

Paper

Accepted.

Next steps

Since the code runs slowly for experiments with 8×8 and 16×16 configurations, we are improving its efficiency by optimizing the code and using C++ in the new version. Additionally, we are implementing new baselines, conducting further experiments, and providing a training script to maximize the potential of SmartMap, i.e., enabling training on any DFG and CGRA.

Project Structure

The structure of this project is organized as follows:

benchmarks: This directory contains all collected benchmarks. benchmarks/changed_MCTS_benchmark contains the DFGs used for mapping, and benchmarks/synthetics contains the DFGs used for pre-training;
configs: All configurations for the experiments, including hyperparameters;
results: Tracked metrics for pre-training, interpretable mappings, checkpoints, images, and CSVs with information about the mapping and pre-training results;
scripts: Shell scripts to facilitate reproducibility and the use of the code;
src: All source code.

Setup

Machine Configuration

The experiments were performed on:

Ubuntu 24.04.1 LTS
GPU: 4GB NVIDIA GeForce RTX 3050
- Driver Version: 550.107.02
- CUDA Version: 12.4
16 CPUs: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz

Prerequisites

Python 3.12.3
PIP 24.0

Install the Dependencies

Install the dependencies before running the code:

git clone https://github.com/lesc-ufv/SmartMap.git && cd SmartMap
chmod +x setup.sh && ./setup.sh

Hyperparameters

The following tables list the main hyperparameters, with a brief description and the value used for each. SelfMapping in train mode is a mapping run in which the agent does not always take greedy actions, whereas SelfMapping in test mode maps greedily, i.e., without exploration. Zero-shot mapping uses only test mode.
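The difference between train mode and test mode can be illustrated with a minimal sketch. It assumes the MuZero-style convention of choosing an action from MCTS visit counts with a softmax temperature; the function name `select_action` and the example visit counts are hypothetical and are not taken from the SmartMap source.

```python
import random


def select_action(visit_counts, temperature, rng=random):
    """Pick a child index from MCTS visit counts (illustrative only).

    temperature == 0: greedy choice of the most-visited child (test mode).
    temperature > 0: sample with probability proportional to
    count ** (1 / temperature) (train mode, keeps some exploration).
    """
    if not visit_counts:
        raise ValueError("visit_counts must not be empty")
    indices = range(len(visit_counts))
    if temperature == 0:
        return max(indices, key=lambda i: visit_counts[i])
    weights = [count ** (1.0 / temperature) for count in visit_counts]
    return rng.choices(indices, weights=weights, k=1)[0]


# Hypothetical visit counts for three candidate placements.
counts = [5, 30, 15]
greedy = select_action(counts, temperature=0)
sampled = select_action(counts, temperature=1.0, rng=random.Random(1234))
print("greedy:", greedy, "sampled:", sampled)
```

With a temperature of 0 the most-visited child is always returned, which matches the "no exploration" behaviour described for test mode; any positive temperature leaves a chance of picking a less-visited child.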

Pre-Training

Hyperparameter	Description	Value Type	Used Value
epochs	Number of epochs during pre-training	int	400
iters	Number of times to train on all batches	int	3
batch size	Number of samples to train in each step	int	64
clip	Clip value	float	0.2
initial_lr	Initial learning rate	float	1e-3
optimizer	Optimizer	torch.optim	AdamW
decay_step	Number of epochs before decaying the learning rate	int	1
decay_rate	Rate to decay the learning rate	float	0.99
weight entropy loss	Weight of entropy loss in the objective function	float	Starts at 0.1 and decreases to 0.01 as training advances
weight kl	Weight of KL loss in the objective function	float	Adjusted dynamically
weight value	Weight of value loss in the objective function	float	1
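As a rough illustration of how initial_lr, decay_step, and decay_rate interact, the sketch below computes the learning rate per epoch. It assumes a step-wise exponential decay (the rate is multiplied by decay_rate once every decay_step epochs); the function name `decayed_lr` is hypothetical, and the actual scheduler used by SmartMap may differ.

```python
def decayed_lr(epoch, initial_lr=1e-3, decay_rate=0.99, decay_step=1):
    """Learning rate after `epoch` epochs under step-wise exponential decay.

    Assumption: the rate is multiplied by `decay_rate` once every
    `decay_step` epochs, starting from `initial_lr`.
    """
    if decay_step <= 0:
        raise ValueError("decay_step must be positive")
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return initial_lr * decay_rate ** (epoch // decay_step)


# Values from the table: 400 epochs, initial_lr=1e-3, decay_rate=0.99.
for epoch in (0, 100, 200, 400):
    print(f"epoch {epoch:3d}: lr = {decayed_lr(epoch):.3e}")
```

Under this assumption, the learning rate falls from 1e-3 to about 1.8e-5 over the 400 pre-training epochs.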

Mapping

Variable Name	Description	Value Type	Used Value
batch_size	Batch size during training in finetune mapping	int	32
checkpoint_interval	Number of training steps before updating the model weights in SharedStorage	int	1
device	Device to train the model	String	cuda
discount	Discount factor	float	0.997
lr_decay_rate	Decay rate of the learning rate	float	0.95
lr_decay_steps	Steps to decay the learning rate exponentially	int	100
lr_init	Initial learning rate	float	0.01
max_moves	Max number of placement actions if mapping does not finish before	int	Number of PEs
momentum	Optimizer parameter	float	0.9
num_max_expansion_test	Max number of children to expand during the expansion stage in SelfMapping during test mode	int	100
num_max_expansion_train	Max number of children to expand during the expansion stage in SelfMapping during train mode	int	200
num_simulations	Number of MCTS simulations	int	Varies according to the architecture size and interconnection style. See `src/utils/util_configs_train.py`
num_unroll_steps	Number of placement actions to keep in a batch for each mapping sample	int	Number of PEs
num_workers	Number of SelfMapping instances running in parallel in train mode	int	10
optimizer	Optimizer for the model	String	"Adam"
pb_c_base	UCB constant 2 according to MuZero equation in the MCTS selection step	float	19652
pb_c_init	UCB constant 1 according to MuZero equation in the MCTS selection step	float	1.25
PER	Whether to use prioritized replay	bool	False
PER_alpha	Degree of prioritization	float	1
ratio	Ratio of training to mappings	float	Starts at 1/5 and moves to 1/1 as training advances
reanalyse_on_gpu	Whether Reanalyse will use the GPU	bool	False
replay_buffer_size	Number of mappings to maintain in the buffer	int	64
root_dirichlet_alpha	Parameter for Dirichlet distribution to create noise	float	0.25
root_exploration_fraction	Proportion of noise to use in root prior node	float	0.25
seed	Seed for NumPy and PyTorch	int	1234
selfplay_on_gpu	Whether SelfMapping will use the GPU	bool	False
self_play_delay	Time that SelfMapping waits before generating a mapping again	float	0.0
training_delay	Time that Trainer waits before training the model again	float	0.0
td_steps	Temporal difference steps	int	Starts at 25% of CGRA size and increases as training advances
training_steps	Max number of training steps during finetune mapping	int	500
train_on_gpu	Whether training will be done on the GPU	bool	True
use_last_model_value	Whether to use the last model's weights in Reanalyse	bool	True
value_loss_weight	Weight for value loss during training	float	1.0
visit_softmax_temperature_fn	Function to update the stochasticity during action selection in SelfMapping in train mode	function	See `utils/SoftmaxTemperature.py`
weight_decay	Optimizer parameter	float	0.0001
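To show where pb_c_base and pb_c_init enter the MCTS selection step, here is a minimal sketch of the pUCT score as defined in the MuZero paper. The function name `ucb_score` and the example statistics are hypothetical, and value normalization is omitted, so this is an illustration rather than the exact SmartMap implementation.

```python
import math


def ucb_score(parent_visits, child_visits, child_prior, child_value,
              pb_c_base=19652, pb_c_init=1.25):
    """MuZero-style pUCT score for one child node (illustrative only).

    The exploration term grows with the prior and with the parent's visit
    count, and shrinks as the child itself is visited more often.
    """
    pb_c = math.log((parent_visits + pb_c_base + 1) / pb_c_base) + pb_c_init
    pb_c *= math.sqrt(parent_visits) / (child_visits + 1)
    return child_value + pb_c * child_prior


# Hypothetical statistics: an unvisited child versus a heavily visited one.
unvisited = ucb_score(parent_visits=10, child_visits=0,
                      child_prior=0.5, child_value=0.0)
visited = ucb_score(parent_visits=10, child_visits=9,
                    child_prior=0.5, child_value=0.3)
print(f"unvisited: {unvisited:.3f}, visited: {visited:.3f}")
```

With these numbers the unvisited child scores higher, so selection would explore it first even though the visited child has a better value estimate.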

Results

This section presents how to access our results and provides a brief description of them.

Pre-Training

For pre-training, the following metrics were tracked: policy loss, value loss, entropy loss, KL divergence loss, total loss, learning rate during training, and mean reward during evaluation. All checkpoints can be found in results/checkpoints/, and the training time is recorded in results/train_results.csv. Figure 1 shows the mean reward obtained by SmartMap on a mesh-like 4x4 CGRA.

Figure 1

Visualise All Pre-Training Results

In the root directory of the project, run the following command and follow the instructions:

tensorboard --logdir results/

Mapping

For mappings, information such as whether the mapping is valid, routing penalty (sum of all rewards of the mapping), test mode (zero-shot or finetune), reason for unsuccessful mapping (if applicable), and mapping time are stored in mapping_results.csv, while interpretable mappings are provided in results/mappings/. Note that MCTS metrics are also saved and will be used for further analysis.

Example of an Interpretable Mapping

This section presents the result of mapping onto a 4x4 mesh-like CGRA for DFG V_8_E_9.

DFG V_8_E_9

Mapping

Timing can be understood as scheduling.

-------------------------------------------------- Mapping V_8_E_9.dot --------------------------------------------------

Successful Mapping | Routing Penalty: -0.08999999821186067 | Mapping Time: 1.312 sec | Num Simulations: 200
Solution found before finish the MCTS simulations.

Placement and Timing:
[PE | Placed DFG node or routing PE (R) | Scheduled Time]

	[0  | -1   | t = -1 ]  [1  | add5 | t = 3  ]  [2  | add4 | t = 2  ]  [3  | add3 | t = 1  ] 
	[4  | -1   | t = -1 ]  [5  | -1   | t = -1 ]  [6  | add7 | t = 1  ]  [7  | add6 | t = 0  ] 
	[8  | -1   | t = -1 ]  [9  | add2 | t = 3  ]  [10 | add1 | t = 2  ]  [11 | add0 | t = 1  ] 
	[12 | -1   | t = -1 ]  [13 | -1   | t = -1 ]  [14 | -1   | t = -1 ]  [15 | -1   | t = -1 ] 

Routing:
(Source PE, Destination PE) | Routing path passing through PEs: [Source PE, PE_x, ..., PE_y,Destination PE]

	(10, 9)  | [10, 9] 
	(11, 10) | [11, 10]
	(7, 11)  | [7, 11] 
	(7, 6)   | [7, 6]  
	(6, 10)  | [6, 10] 
	(6, 2)   | [6, 2]  
	(7, 3)   | [7, 3]  
	(3, 2)   | [3, 2]  
	(2, 1)   | [2, 1]  

MCTS results: Mean visited rate: 54.993% | Mean visited nodes: 127.500 | 2xMean expanded nodes: 1.892 | Mean max tree depth 7.000 | Number of Backtrackings: 0

-------------------------------------------------- End Mapping V_8_E_9.dot --------------------------------------------------
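The mapping above can be sanity-checked with a short script. The sketch below assumes that PEs are numbered row by row (as in the printed grid) and that a mesh link connects only horizontally or vertically adjacent PEs; the helper names `mesh_neighbors` and `check_mapping` are hypothetical. It checks that each routing path starts and ends at the stated PEs, that every hop follows a mesh link, and that each consumer is scheduled after its producer.

```python
ROWS, COLS = 4, 4

# PE index -> (DFG node, scheduled time), copied from the placement grid.
placement = {
    1: ("add5", 3), 2: ("add4", 2), 3: ("add3", 1),
    6: ("add7", 1), 7: ("add6", 0),
    9: ("add2", 3), 10: ("add1", 2), 11: ("add0", 1),
}

# (source PE, destination PE) -> routing path, copied from the routing list.
routes = {
    (10, 9): [10, 9], (11, 10): [11, 10], (7, 11): [7, 11],
    (7, 6): [7, 6], (6, 10): [6, 10], (6, 2): [6, 2],
    (7, 3): [7, 3], (3, 2): [3, 2], (2, 1): [2, 1],
}


def mesh_neighbors(a, b):
    """True if PEs a and b share a mesh link (assumes row-major numbering)."""
    row_a, col_a = divmod(a, COLS)
    row_b, col_b = divmod(b, COLS)
    return abs(row_a - row_b) + abs(col_a - col_b) == 1


def check_mapping(placement, routes):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    for (src, dst), path in routes.items():
        if any(not 0 <= pe < ROWS * COLS for pe in path):
            problems.append(f"{(src, dst)}: PE index out of range")
            continue
        if path[0] != src or path[-1] != dst:
            problems.append(f"{(src, dst)}: path endpoints do not match")
        for a, b in zip(path, path[1:]):
            if not mesh_neighbors(a, b):
                problems.append(f"{(src, dst)}: no mesh link {a} -> {b}")
        if src not in placement or dst not in placement:
            problems.append(f"{(src, dst)}: endpoint holds no DFG node")
        elif placement[dst][1] <= placement[src][1]:
            problems.append(f"{(src, dst)}: consumer not scheduled later")
    return problems


problems = check_mapping(placement, routes)
print("mapping looks consistent" if not problems else problems)
```

For this example every route is a single hop between adjacent PEs, and each destination node is scheduled one time step after its source.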

Workflow

The workflow is simple:

Pre-train a model on synthetic DFGs and save the checkpoint in results/checkpoints/.
Map DFGs in changed_MCTS_benchmark/ with zero-shot and finetune using the pre-trained model.

Scripts Information

The scripts for pre-training and mapping follow this pattern:

<script> <ConfigFile> <ConfigClass> <ARCH_STYLE> <nxm>

Here, n and m can be any positive integers, and ARCH_STYLE can be OH_TOR_DIAG, MESH, or ONE_HOP. ConfigFile is the filename of a configuration in configs/ for a compiler, and ConfigClass is the class defined in that configuration file. <script> stands for any script in scripts/.
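As an illustration of the argument pattern, the following sketch validates the <ARCH_STYLE> and <nxm> arguments before a script is launched. The function name `parse_cgra_args` is hypothetical and is not part of the SmartMap code base; the repository's own scripts may parse these arguments differently.

```python
import re

# Interconnection styles listed in this README.
ARCH_STYLES = {"OH_TOR_DIAG", "MESH", "ONE_HOP"}


def parse_cgra_args(arch_style, dims):
    """Validate <ARCH_STYLE> and split <nxm> into two positive integers."""
    if arch_style not in ARCH_STYLES:
        raise ValueError(f"unknown ARCH_STYLE: {arch_style!r}")
    match = re.fullmatch(r"([1-9]\d*)x([1-9]\d*)", dims)
    if match is None:
        raise ValueError(f"expected <n>x<m> with positive integers, got {dims!r}")
    return arch_style, int(match.group(1)), int(match.group(2))


parsed = parse_cgra_args("MESH", "4x4")
print(parsed)  # ('MESH', 4, 4)
```

Rejecting malformed values such as 0x4 or 4X4 up front avoids starting a long pre-training run with an unintended CGRA size.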

1. Pre-Training

scripts/train.sh config_smartmap.py ConfigSmartMap MESH 4x4

2. Mapping

Zero-shot

scripts/map_with_zero_shot.sh config_smartmap.py ConfigSmartMap MESH 4x4

Finetune

scripts/map_with_finetune.sh config_smartmap.py ConfigSmartMap MESH 4x4
