A comprehensive research project demonstrating advanced Message Passing Interface (MPI) programming concepts through practical examples and performance benchmarking.
This project provides a complete learning path for MPI parallel programming, from basic concepts to advanced performance optimization. It includes real-world algorithms, comprehensive benchmarking tools, and detailed performance analysis.
- Progressive Learning Path: Basic → Intermediate → Advanced examples
- Real Performance Data: Actual GFLOPS measurements and scalability analysis
- Professional Tooling: Comprehensive Makefile with automated testing
- Benchmarking Suite: Automated performance analysis across multiple process counts
- Multi-Language: Both C and Python implementations
- Complete Documentation: From quick start to research findings
```bash
# macOS
brew install open-mpi
pip3 install mpi4py

# Ubuntu/Debian
sudo apt-get install libopenmpi-dev openmpi-bin
pip3 install mpi4py

# CentOS/RHEL
sudo yum install openmpi-devel
# Load MPI module: module load mpi/openmpi-x86_64
pip3 install mpi4py
```

```bash
git clone https://github.com/your-username/mpi-parallel-computing.git
cd mpi-parallel-computing

# Install dependencies (macOS)
make install-deps

# Run demonstration
make demo
```

```bash
# Compile all examples
make all

# Run hello world with 4 processes
make run-hello

# Run performance benchmarks
make benchmark
```

| Processes | Samples/Second | Speedup | Efficiency |
|---|---|---|---|
| 1 | 131M | 1.0x | 100% |
| 2 | 265M | 2.0x | 100% |
| 4 | 497M | 3.8x | 95% |
| 8 | 950M | 7.2x | 90% |
- 800×800 matrices: 8.31 GFLOPS on 4 processes
- Memory efficiency: Distributed across processes
- Near-linear scaling: 90%+ efficiency up to 8 processes
```
mpi-parallel-computing/
├── examples/
│   ├── basic/                        # Foundational concepts
│   │   ├── hello_world_mpi.c         # Process identification
│   │   ├── hello_world_mpi.py        # Python MPI basics
│   │   └── parallel_sum_mpi.c        # Collective operations
│   ├── intermediate/                 # Advanced algorithms
│   │   ├── matrix_multiply_mpi.c     # 2D data distribution
│   │   └── monte_carlo_pi_mpi.c      # Statistical simulation
│   └── advanced/                     # Complex patterns
├── benchmarks/                       # Performance testing
│   ├── performance_test.sh           # Automated benchmarking
│   └── results/                      # Benchmark outputs
├── docs/                             # Documentation
│   └── RESEARCH_OVERVIEW.md          # Detailed findings
└── Makefile                          # Build automation
```
```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    printf("Hello from process %d of %d\n", world_rank, world_size);

    MPI_Finalize();
    return 0;
}
```

```bash
mpirun -np 4 examples/basic/hello_world_mpi
```

```bash
# Run with 8 processes, 100 million samples
mpirun -np 8 examples/intermediate/monte_carlo_pi_mpi 100000000
```

Output:

```
=== MPI Monte Carlo Pi Estimation ===
Total samples: 100000000
Processes: 8
π estimate: 3.1415923847
Accuracy: 99.99992%
Performance: 950M samples/second
```
```bash
make all          # Compile all examples
make basic        # Basic examples only
make intermediate # Intermediate examples only
make advanced     # Advanced examples only
```

```bash
make demo       # Quick demonstration
make test       # Functionality tests
make benchmark  # Comprehensive benchmarks
make scale-test # Scalability analysis
```

```bash
make run-hello       # Hello world example
make run-sum         # Parallel sum
make run-matrix      # Matrix multiplication
make run-monte-carlo # Monte Carlo simulation
make run-python      # Python MPI example
```

```bash
make status # Project status
make clean  # Remove compiled files
make help   # Show all commands
```

- Convergence: O(1/√n) theoretical rate
- Parallel efficiency: Embarrassingly parallel
- Scaling: Near-linear up to 8 processes
- Statistical accuracy: Process-independent random seeds
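The core of this approach fits in a few lines: each rank draws its own samples using a rank-dependent seed, and a single `MPI_Reduce` combines the per-rank hit counts. The sketch below is an illustrative outline under those assumptions, not the exact contents of `monte_carlo_pi_mpi.c`.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Total sample count from the command line (default 10 million). */
    long long total = (argc > 1) ? atoll(argv[1]) : 10000000LL;
    /* Even split of work; the first ranks absorb any remainder. */
    long long local = total / size + (rank < total % size ? 1 : 0);

    /* Process-independent seed so ranks draw different streams. */
    unsigned int seed = 12345u + 777u * (unsigned int)rank;
    long long hits = 0;
    for (long long i = 0; i < local; i++) {
        double x = (double)rand_r(&seed) / RAND_MAX;
        double y = (double)rand_r(&seed) / RAND_MAX;
        if (x * x + y * y <= 1.0) hits++;
    }

    /* Combine per-rank hit counts on the root. */
    long long total_hits = 0;
    MPI_Reduce(&hits, &total_hits, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi estimate: %.10f\n", 4.0 * (double)total_hits / (double)total);

    MPI_Finalize();
    return 0;
}
```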
- Computational complexity: O(n³) for n×n matrices
- Memory distribution: Row-wise partitioning
- Communication pattern: Scatter-broadcast-gather
- Performance: 8.31 GFLOPS sustained throughput
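The scatter-broadcast-gather pattern can be sketched as follows. This is a simplified illustration that assumes the matrix dimension is divisible by the process count and that `B` is allocated on every rank; it is not the exact code in `matrix_multiply_mpi.c`.

```c
#include <mpi.h>
#include <stdlib.h>

/* C = A * B with row-wise partitioning of A.
 * A and C are only significant on the root; B must be allocated (n*n) on all ranks.
 * Assumes n is divisible by the number of processes. */
void matmul_mpi(double *A, double *B, double *C, int n, MPI_Comm comm) {
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    int rows = n / size;

    double *A_local = malloc((size_t)rows * n * sizeof(double));
    double *C_local = malloc((size_t)rows * n * sizeof(double));

    /* Scatter blocks of rows of A, broadcast all of B. */
    MPI_Scatter(A, rows * n, MPI_DOUBLE, A_local, rows * n, MPI_DOUBLE, 0, comm);
    MPI_Bcast(B, n * n, MPI_DOUBLE, 0, comm);

    /* Local computation on contiguous rows. */
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += A_local[i * n + k] * B[k * n + j];
            C_local[i * n + j] = sum;
        }

    /* Gather the result rows back to the root. */
    MPI_Gather(C_local, rows * n, MPI_DOUBLE, C, rows * n, MPI_DOUBLE, 0, comm);

    free(A_local);
    free(C_local);
}
```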
- Process count: Optimal = CPU core count
- Memory layout: Contiguous access patterns crucial
- Communication: Batch operations minimize overhead
- Load balancing: Even distribution prevents bottlenecks
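For the load-balancing point, one common approach is a block distribution that spreads any remainder over the first few ranks, so no process ever holds more than one extra item. The helper below is a hypothetical illustration, not code taken from this repository.

```c
/* Compute the contiguous [start, start+count) range owned by `rank`
 * when n items are split across `size` processes as evenly as possible. */
void block_range(long long n, int rank, int size,
                 long long *start, long long *count) {
    long long base = n / size;      /* every rank gets at least this many */
    long long rem  = n % size;      /* the first `rem` ranks get one extra */
    *count = base + (rank < rem ? 1 : 0);
    *start = rank * base + (rank < rem ? rank : rem);
}
```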
```bash
# Run all tests
make test

# Individual test suites
make test-basic
make test-intermediate
make test-advanced

# Performance regression testing
make benchmark
```

The project includes comprehensive benchmarking tools:

```bash
# Full benchmark suite
make benchmark

# Results stored in benchmarks/results/
ls benchmarks/results/benchmark_*.txt
```

Benchmarks include:
- Scalability testing across process counts
- Performance metrics (GFLOPS, samples/sec)
- System profiling (CPU, memory usage)
- Efficiency analysis (speedup, parallel efficiency)
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- New algorithms: Additional parallel examples
- Platform support: Windows, other Linux distros
- Performance optimization: Algorithm improvements
- Documentation: Tutorials, explanations
- Testing: Additional test cases
1. Fork the repository
2. Create a feature branch
3. Add your example in the appropriate directory
4. Update the documentation
5. Add tests and benchmarks
6. Submit a pull request
- Process Management: Initialization, ranks, communicators
- Point-to-Point: Send, receive, synchronization
- Collective Operations: Broadcast, scatter, gather, reduce
- Performance: Timing, profiling, optimization
- Advanced Patterns: Master-worker, SPMD, pipeline
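These concepts map onto a small set of core calls. The fragment below is a minimal, self-contained sketch combining point-to-point messaging, collective operations, and `MPI_Wtime`-based timing; it is illustrative only and not one of the repository's example files.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double t0 = MPI_Wtime();                       /* start timing */

    /* Point-to-point: rank 0 sends a token to rank 1 (if it exists). */
    int token = 42;
    if (size > 1) {
        if (rank == 0)
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* Collectives: broadcast from the root, then reduce a per-rank value. */
    MPI_Bcast(&token, 1, MPI_INT, 0, MPI_COMM_WORLD);
    int sum = 0;
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    double elapsed = MPI_Wtime() - t0;
    if (rank == 0)
        printf("sum of ranks = %d, elapsed = %.6f s\n", sum, elapsed);

    MPI_Finalize();
    return 0;
}
```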
Perfect for:
- Computer Science courses on parallel programming
- Research projects requiring MPI implementation
- Performance analysis and optimization studies
- HPC training and skill development
- 7.2x speedup with 8 processes (Monte Carlo)
- 8.31 GFLOPS sustained (Matrix multiplication)
- 90% parallel efficiency at scale
- 950M samples/second throughput
This project is licensed under the MIT License - see the LICENSE file for details.
- Open MPI Community for their excellent MPI implementation
- HPC Research Community for algorithmic insights
- Contributors who help improve this project
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: kapilnema27@gmail.com
Star this repo if you find it useful for learning MPI!

Check out the research documentation for detailed analysis and findings.

Ready to learn parallel programming? Start with `make demo`!