A comparative implementation of a sparsely connected Feed-Forward Neural Network (FFNN) using two high-performance parallel computing techniques: OpenMP and CUDA

LucaTedeschini/APAI_project_work

High-Performance Neural Network Implementations

Project Overview

This project implements a sparsely connected Feed-Forward Neural Network (FFNN) using two parallel computing approaches:

  1. OpenMP for multi-core CPU parallelization
  2. CUDA for GPU acceleration

This project is part of the Architecture and Platform for Artificial Intelligence exam.

Network Architecture

The neural network has the following key characteristics:

  • Input layer size: N neurons
  • Total layers: K
  • Connectivity reach: R (each node connects to R nodes of the previous layer)
  • Sigmoid activation function
  • Bias term at every node

Parallelization Strategies

OpenMP Implementation

  • Parallelizes computation across nodes within each layer
  • Uses #pragma omp parallel for to distribute the workload among threads
  • Supports multi-threading on shared-memory CPU architectures

CUDA Implementation

  • Implements two kernel versions:
    1. Shared memory kernel
    2. Global memory kernel
  • Uses a double-buffering strategy for layer computations
  • Optimizes memory access patterns for GPU computation
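A hedged sketch of what the global-memory kernel and the host-side double buffering might look like. Kernel name, weight indexing, and the assumed per-layer shrink are illustrative, not code from the repository, and running it requires a CUDA-capable device:

```cuda
#include <math.h>

// One thread per output node; reads R consecutive inputs from global memory.
__global__ void forward_layer(const double *x, double *y, const double *w,
                              double bias, int n_out, int R) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_out) return;
    double sum = bias;
    for (int j = 0; j < R; j++)
        sum += x[i + j] * w[i * R + j];    // single weight array for brevity
    y[i] = 1.0 / (1.0 + exp(-sum));       // sigmoid
}

// Host loop: the two device buffers swap roles each layer (double buffering),
// so no intermediate results are copied back until the network is done.
void forward_network(double *d_a, double *d_b, const double *d_w,
                     double bias, int n, int K, int R) {
    for (int k = 1; k < K; k++) {
        int n_out = n - (R - 1);           // assumed shrink per layer
        forward_layer<<<(n_out + 255) / 256, 256>>>(d_a, d_b, d_w,
                                                    bias, n_out, R);
        double *tmp = d_a; d_a = d_b; d_b = tmp;   // swap input/output
        n = n_out;
    }
    cudaDeviceSynchronize();
}
```

The shared-memory variant mentioned above would additionally stage the input window of each block into `__shared__` storage, since neighboring threads reuse overlapping slices of x.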

Compilation and Running

OpenMP Version

Compilation

cd openMP
gcc -std=c99 -Wall -Wpedantic -fopenmp -Iinclude src/main.c src/network.c src/utilities.c -o openMP -lm

or

cd openMP
mkdir build
cd build
cmake ..
make

Running

./openMP [N] [K] [machine_output]
  • N: Number of neurons in the first layer
  • K: Total number of layers
  • machine_output: 1 for machine-readable output, 0 for human-readable (default: 1)

CUDA Version

Compilation

cd CUDA
mkdir build
nvcc -Iinclude src/main.cu -o build/main

Running

cd build
./main [N] [K] [machine_output]
  • Same parameters as the OpenMP version

Analysis

Refer to Report.pdf for a detailed performance analysis of both implementations.

Requirements

  • GCC with OpenMP support
  • NVIDIA CUDA Toolkit
  • CMake (optional, for alternative build method)

Authors

Luca Tedeschini - University of Bologna
