A comprehensive research project demonstrating advanced Message Passing Interface (MPI) programming concepts through practical examples and performance benchmarking.
This project provides a complete learning path for MPI parallel programming, from basic concepts to advanced performance optimization. It includes real-world algorithms, comprehensive benchmarking tools, and detailed performance analysis.
- Progressive Learning Path: Basic → Intermediate → Advanced examples
- Real Performance Data: Actual GFLOPS measurements and scalability analysis
- Professional Tooling: Comprehensive Makefile with automated testing
- Benchmarking Suite: Automated performance analysis across multiple process counts
- Multi-Language: Both C and Python implementations
- Complete Documentation: From quick start to research findings
```bash
# macOS
brew install open-mpi
pip3 install mpi4py

# Ubuntu/Debian
sudo apt-get install libopenmpi-dev openmpi-bin
pip3 install mpi4py

# CentOS/RHEL
sudo yum install openmpi-devel
# Load MPI module: module load mpi/openmpi-x86_64
pip3 install mpi4py
```

```bash
git clone https://github.com/your-username/mpi-parallel-computing.git
cd mpi-parallel-computing

# Install dependencies (macOS)
make install-deps

# Run demonstration
make demo
```

```bash
# Compile all examples
make all

# Run hello world with 4 processes
make run-hello

# Run performance benchmarks
make benchmark
```

| Processes | Samples/Second | Speedup | Efficiency |
|---|---|---|---|
| 1 | 131M | 1.0x | 100% |
| 2 | 265M | 2.0x | 100% |
| 4 | 497M | 3.8x | 95% |
| 8 | 950M | 7.2x | 90% |
- 800×800 matrices: 8.31 GFLOPS on 4 processes
- Memory efficiency: Distributed across processes
- Near-linear scaling: 90%+ efficiency up to 8 processes
```
mpi-parallel-computing/
├── examples/
│   ├── basic/                        # Foundational concepts
│   │   ├── hello_world_mpi.c         # Process identification
│   │   ├── hello_world_mpi.py        # Python MPI basics
│   │   └── parallel_sum_mpi.c        # Collective operations
│   ├── intermediate/                 # Advanced algorithms
│   │   ├── matrix_multiply_mpi.c     # 2D data distribution
│   │   └── monte_carlo_pi_mpi.c      # Statistical simulation
│   └── advanced/                     # Complex patterns
├── benchmarks/                       # Performance testing
│   ├── performance_test.sh           # Automated benchmarking
│   └── results/                      # Benchmark outputs
├── docs/                             # Documentation
│   └── RESEARCH_OVERVIEW.md          # Detailed findings
└── Makefile                          # Build automation
```
```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    printf("Hello from process %d of %d\n", world_rank, world_size);

    MPI_Finalize();
    return 0;
}
```

```bash
mpirun -np 4 examples/basic/hello_world_mpi
```

```bash
# Run with 8 processes, 100 million samples
mpirun -np 8 examples/intermediate/monte_carlo_pi_mpi 100000000
```

Output:

```
=== MPI Monte Carlo Pi Estimation ===
Total samples: 100000000
Processes: 8
π estimate: 3.1415923847
Accuracy: 99.99992%
Performance: 950M samples/second
```
```bash
make all          # Compile all examples
make basic        # Basic examples only
make intermediate # Intermediate examples only
make advanced     # Advanced examples only
```

```bash
make demo       # Quick demonstration
make test       # Functionality tests
make benchmark  # Comprehensive benchmarks
make scale-test # Scalability analysis
```

```bash
make run-hello       # Hello world example
make run-sum         # Parallel sum
make run-matrix      # Matrix multiplication
make run-monte-carlo # Monte Carlo simulation
make run-python      # Python MPI example
```

```bash
make status # Project status
make clean  # Remove compiled files
make help   # Show all commands
```

- Convergence: O(1/√n) theoretical rate
- Parallel efficiency: Embarrassingly parallel
- Scaling: Near-linear up to 8 processes
- Statistical accuracy: Process-independent random seeds
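The core of this approach fits in a few lines: each rank draws its own samples using a rank-dependent seed, and a single `MPI_Reduce` combines the per-rank hit counts. The sketch below is an illustrative outline under those assumptions, not the exact contents of `monte_carlo_pi_mpi.c`.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Total sample count from the command line (default 10 million). */
    long long total = (argc > 1) ? atoll(argv[1]) : 10000000LL;
    /* Even split of work; the first ranks absorb any remainder. */
    long long local = total / size + (rank < total % size ? 1 : 0);

    /* Process-independent seed so ranks draw different streams. */
    unsigned int seed = 12345u + 777u * (unsigned int)rank;
    long long hits = 0;
    for (long long i = 0; i < local; i++) {
        double x = (double)rand_r(&seed) / RAND_MAX;
        double y = (double)rand_r(&seed) / RAND_MAX;
        if (x * x + y * y <= 1.0) hits++;
    }

    /* Combine per-rank hit counts on the root. */
    long long total_hits = 0;
    MPI_Reduce(&hits, &total_hits, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi estimate: %.10f\n", 4.0 * (double)total_hits / (double)total);

    MPI_Finalize();
    return 0;
}
```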
- Computational complexity: O(n³) for n×n matrices
- Memory distribution: Row-wise partitioning
- Communication pattern: Scatter-broadcast-gather
- Performance: 8.31 GFLOPS sustained throughput
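The scatter-broadcast-gather pattern can be sketched as follows. This is a simplified illustration that assumes the matrix dimension is divisible by the process count and that `B` is allocated on every rank; it is not the exact code in `matrix_multiply_mpi.c`.

```c
#include <mpi.h>
#include <stdlib.h>

/* C = A * B with row-wise partitioning of A.
 * A and C are only significant on the root; B must be allocated (n*n) on all ranks.
 * Assumes n is divisible by the number of processes. */
void matmul_mpi(double *A, double *B, double *C, int n, MPI_Comm comm) {
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    int rows = n / size;

    double *A_local = malloc((size_t)rows * n * sizeof(double));
    double *C_local = malloc((size_t)rows * n * sizeof(double));

    /* Scatter blocks of rows of A, broadcast all of B. */
    MPI_Scatter(A, rows * n, MPI_DOUBLE, A_local, rows * n, MPI_DOUBLE, 0, comm);
    MPI_Bcast(B, n * n, MPI_DOUBLE, 0, comm);

    /* Local computation on contiguous rows. */
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += A_local[i * n + k] * B[k * n + j];
            C_local[i * n + j] = sum;
        }

    /* Gather the result rows back to the root. */
    MPI_Gather(C_local, rows * n, MPI_DOUBLE, C, rows * n, MPI_DOUBLE, 0, comm);

    free(A_local);
    free(C_local);
}
```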
- Process count: Optimal = CPU core count
- Memory layout: Contiguous access patterns crucial
- Communication: Batch operations minimize overhead
- Load balancing: Even distribution prevents bottlenecks
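For the load-balancing point, one common approach is a block distribution that spreads any remainder over the first few ranks, so no process ever holds more than one extra item. The helper below is a hypothetical illustration, not code taken from this repository.

```c
/* Compute the contiguous [start, start+count) range owned by `rank`
 * when n items are split across `size` processes as evenly as possible. */
void block_range(long long n, int rank, int size,
                 long long *start, long long *count) {
    long long base = n / size;      /* every rank gets at least this many */
    long long rem  = n % size;      /* the first `rem` ranks get one extra */
    *count = base + (rank < rem ? 1 : 0);
    *start = rank * base + (rank < rem ? rank : rem);
}
```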
```bash
# Run all tests
make test

# Individual test suites
make test-basic
make test-intermediate
make test-advanced

# Performance regression testing
make benchmark
```

The project includes comprehensive benchmarking tools:

```bash
# Full benchmark suite
make benchmark

# Results stored in benchmarks/results/
ls benchmarks/results/benchmark_*.txt
```

Benchmarks include:
- Scalability testing across process counts
- Performance metrics (GFLOPS, samples/sec)
- System profiling (CPU, memory usage)
- Efficiency analysis (speedup, parallel efficiency)
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- New algorithms: Additional parallel examples
- Platform support: Windows, other Linux distros
- Performance optimization: Algorithm improvements
- Documentation: Tutorials, explanations
- Testing: Additional test cases
1. Fork the repository
2. Create a feature branch
3. Add your example in the appropriate directory
4. Update the documentation
5. Add tests and benchmarks
6. Submit a pull request
- Process Management: Initialization, ranks, communicators
- Point-to-Point: Send, receive, synchronization
- Collective Operations: Broadcast, scatter, gather, reduce
- Performance: Timing, profiling, optimization
- Advanced Patterns: Master-worker, SPMD, pipeline
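These concepts map onto a small set of core calls. The fragment below is a minimal, self-contained sketch combining point-to-point messaging, collective operations, and `MPI_Wtime`-based timing; it is illustrative only and not one of the repository's example files.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double t0 = MPI_Wtime();                       /* start timing */

    /* Point-to-point: rank 0 sends a token to rank 1 (if it exists). */
    int token = 42;
    if (size > 1) {
        if (rank == 0)
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* Collectives: broadcast from the root, then reduce a per-rank value. */
    MPI_Bcast(&token, 1, MPI_INT, 0, MPI_COMM_WORLD);
    int sum = 0;
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    double elapsed = MPI_Wtime() - t0;
    if (rank == 0)
        printf("sum of ranks = %d, elapsed = %.6f s\n", sum, elapsed);

    MPI_Finalize();
    return 0;
}
```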
Perfect for:
- Computer Science courses on parallel programming
- Research projects requiring MPI implementation
- Performance analysis and optimization studies
- HPC training and skill development
- 7.2x speedup with 8 processes (Monte Carlo)
- 8.31 GFLOPS sustained (Matrix multiplication)
- 90% parallel efficiency at scale
- 950M samples/second throughput
This project is licensed under the MIT License - see the LICENSE file for details.
- Open MPI Community for their excellent MPI implementation
- HPC Research Community for algorithmic insights
- Contributors who help improve this project
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: kapilnema27@gmail.com
Star this repo if you find it useful for learning MPI!

Check out the research documentation for detailed analysis and findings.

Ready to learn parallel programming? Start with `make demo`!