
🚀 MPI Parallel Computing Research


A comprehensive research project demonstrating advanced Message Passing Interface (MPI) programming concepts through practical examples and performance benchmarking.

🎯 Overview

This project provides a complete learning path for MPI parallel programming, from basic concepts to advanced performance optimization. It includes real-world algorithms, comprehensive benchmarking tools, and detailed performance analysis.

⚡ Key Features

  • 📚 Progressive Learning Path: Basic → Intermediate → Advanced examples
  • 🔬 Real Performance Data: Actual GFLOPS measurements and scalability analysis
  • 🛠️ Professional Tooling: Comprehensive Makefile with automated testing
  • 📊 Benchmarking Suite: Automated performance analysis across multiple process counts
  • 🐍 Multi-Language: Both C and Python implementations
  • 📖 Complete Documentation: From quick start to research findings

🚀 Quick Start

Prerequisites

# macOS
brew install open-mpi
pip3 install mpi4py

# Ubuntu/Debian
sudo apt-get install libopenmpi-dev openmpi-bin
pip3 install mpi4py

# CentOS/RHEL
sudo yum install openmpi-devel
# Load MPI module: module load mpi/openmpi-x86_64
pip3 install mpi4py

Installation

git clone https://github.com/kapil27/mpi-parallel-computing.git
cd mpi-parallel-computing

# Install dependencies (macOS)
make install-deps

# Run demonstration
make demo

First Run

# Compile all examples
make all

# Run hello world with 4 processes
make run-hello

# Run performance benchmarks
make benchmark

📊 Performance Results

Monte Carlo Pi Estimation Scalability

Processes | Samples/Second | Speedup | Efficiency
----------|----------------|---------|-----------
1         | 131M           | 1.0x    | 100%
2         | 265M           | 2.0x    | 100%
4         | 497M           | 3.8x    | 95%
8         | 950M           | 7.2x    | 90%
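
Speedup here is the multi-process sampling rate divided by the single-process rate, and efficiency is speedup divided by the process count; for example, at 8 processes 950M / 131M ≈ 7.2x speedup and 7.2 / 8 = 90% efficiency.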

Matrix Multiplication Performance

  • 800×800 matrices: 8.31 GFLOPS on 4 processes
  • Memory efficiency: matrices distributed row-wise across processes
  • Near-linear scaling: 90%+ efficiency up to 8 processes

πŸ—οΈ Project Structure

mpi-parallel-computing/
├── examples/
│   ├── basic/                    # 🌱 Foundational concepts
│   │   ├── hello_world_mpi.c     # Process identification
│   │   ├── hello_world_mpi.py    # Python MPI basics
│   │   └── parallel_sum_mpi.c    # Collective operations
│   ├── intermediate/             # 🚀 Advanced algorithms
│   │   ├── matrix_multiply_mpi.c # 2D data distribution
│   │   └── monte_carlo_pi_mpi.c  # Statistical simulation
│   └── advanced/                 # 🧠 Complex patterns
├── benchmarks/                   # 📈 Performance testing
│   ├── performance_test.sh       # Automated benchmarking
│   └── results/                  # Benchmark outputs
├── docs/                         # 📚 Documentation
│   └── RESEARCH_OVERVIEW.md      # Detailed findings
└── Makefile                      # 🛠️ Build automation

💻 Examples

Hello World MPI

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    
    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    
    printf("Hello from process %d of %d\n", world_rank, world_size);
    
    MPI_Finalize();
    return 0;
}

Run with 4 processes (after compiling with make all):

mpirun -np 4 examples/basic/hello_world_mpi

Monte Carlo Pi Estimation

# Run with 8 processes, 100 million samples
mpirun -np 8 examples/intermediate/monte_carlo_pi_mpi 100000000

Output:

=== MPI Monte Carlo Pi Estimation ===
Total samples: 100000000
Processes: 8
π estimate: 3.1415923847
Accuracy: 99.99992%
Performance: 950M samples/second
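
The structure of the computation is simple: each process counts hits inside the unit circle over its share of the samples, then a single MPI_Reduce sums the counts on rank 0. Below is a minimal, hypothetical sketch of that pattern; the actual monte_carlo_pi_mpi.c may differ in seeding, timing, and argument handling.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    long long total = (argc > 1) ? atoll(argv[1]) : 1000000;
    long long local = total / size;          /* even share of samples per process */
    unsigned int seed = 12345u + rank;       /* independent seed per process */

    long long hits = 0;
    for (long long i = 0; i < local; i++) {
        double x = (double)rand_r(&seed) / RAND_MAX;
        double y = (double)rand_r(&seed) / RAND_MAX;
        if (x * x + y * y <= 1.0) hits++;    /* point falls inside the unit circle */
    }

    long long global_hits = 0;
    MPI_Reduce(&hits, &global_hits, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi estimate: %.10f\n", 4.0 * global_hits / (double)(local * size));

    MPI_Finalize();
    return 0;
}

Because the only communication is a single reduction at the end, the method is embarrassingly parallel, which is why the scaling shown above stays near-linear.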

πŸ› οΈ Available Commands

Compilation

make all              # Compile all examples
make basic            # Basic examples only  
make intermediate     # Intermediate examples only
make advanced         # Advanced examples only

Execution

make demo             # Quick demonstration
make test             # Functionality tests
make benchmark        # Comprehensive benchmarks
make scale-test       # Scalability analysis

Individual Examples

make run-hello        # Hello world example
make run-sum          # Parallel sum
make run-matrix       # Matrix multiplication  
make run-monte-carlo  # Monte Carlo simulation
make run-python       # Python MPI example

Utilities

make status           # Project status
make clean            # Remove compiled files
make help             # Show all commands

🔬 Research Findings

Algorithm Performance Analysis

Monte Carlo Method

  • Convergence: O(1/√n) theoretical rate
  • Parallel efficiency: Embarrassingly parallel
  • Scaling: Near-linear up to 8 processes
  • Statistical accuracy: independent random seed per process

Matrix Multiplication

  • Computational complexity: O(n³) for n×n matrices
  • Memory distribution: Row-wise partitioning
  • Communication pattern: Scatter-broadcast-gather (sketched below)
  • Performance: 8.31 GFLOPS sustained throughput
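
A minimal sketch of that scatter-broadcast-gather pattern is shown below, assuming the matrix dimension divides evenly by the process count. It is hypothetical and simplified; the full matrix_multiply_mpi.c may differ in initialization, timing, and error handling.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int n = 800;                              /* assumes n % size == 0 */
    int rows = n / size;                      /* rows owned by each process */

    double *A = NULL, *C = NULL;
    double *B       = malloc((size_t)n * n * sizeof(double));
    double *A_local = malloc((size_t)rows * n * sizeof(double));
    double *C_local = malloc((size_t)rows * n * sizeof(double));

    if (rank == 0) {                          /* root holds the full matrices */
        A = malloc((size_t)n * n * sizeof(double));
        C = malloc((size_t)n * n * sizeof(double));
        for (int i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; }
    }

    /* Scatter row blocks of A; broadcast all of B */
    MPI_Scatter(A, rows * n, MPI_DOUBLE,
                A_local, rows * n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(B, n * n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Each process computes its block of rows of C = A * B */
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += A_local[i * n + k] * B[k * n + j];
            C_local[i * n + j] = sum;
        }

    /* Gather the row blocks back into C on the root */
    MPI_Gather(C_local, rows * n, MPI_DOUBLE,
               C, rows * n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("C[0][0] = %.1f\n", C[0]);     /* 2n = 1600.0 for this test data */

    MPI_Finalize();
    return 0;
}

Row-wise partitioning keeps each process's slice of A and C contiguous in memory, which lines up with the memory-layout insight in the list below.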

Optimization Insights

  1. Process count: Optimal = CPU core count
  2. Memory layout: Contiguous access patterns crucial
  3. Communication: Batch operations minimize overhead (see the sketch after this list)
  4. Load balancing: Even distribution prevents bottlenecks
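
To illustrate point 3, the sketch below (hypothetical, not taken from the repository) sends a 100,000-element buffer from rank 0 to rank 1 in a single message; the commented-out loop shows the per-element alternative that pays the message-latency overhead 100,000 times. Run it with at least 2 processes.

#include <mpi.h>
#include <stdio.h>

#define N 100000

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    static double data[N];                    /* static: zero-initialized */

    if (rank == 0) {
        /* Batched: one message carries the whole buffer */
        MPI_Send(data, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);

        /* Per-element (avoid): N messages, each paying latency overhead
        for (int i = 0; i < N; i++)
            MPI_Send(&data[i], 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        */
    } else if (rank == 1) {
        MPI_Recv(data, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Received %d doubles in one message\n", N);
    }

    MPI_Finalize();
    return 0;
}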

🧪 Testing

# Run all tests
make test

# Individual test suites
make test-basic
make test-intermediate  
make test-advanced

# Performance regression testing
make benchmark

📈 Benchmarking

The project includes comprehensive benchmarking tools:

# Full benchmark suite
make benchmark

# Results stored in benchmarks/results/
ls benchmarks/results/benchmark_*.txt

Benchmarks include:

  • Scalability testing across process counts
  • Performance metrics (GFLOPS, samples/sec)
  • System profiling (CPU, memory usage)
  • Efficiency analysis (speedup, parallel efficiency)

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Areas for Contribution

  • New algorithms: Additional parallel examples
  • Platform support: Windows, other Linux distros
  • Performance optimization: Algorithm improvements
  • Documentation: Tutorials, explanations
  • Testing: Additional test cases

Quick Contribution Guide

  1. Fork the repository
  2. Create a feature branch
  3. Add your example in appropriate directory
  4. Update documentation
  5. Add tests and benchmarks
  6. Submit pull request

📚 Learning Resources

MPI Concepts Covered

  • Process Management: Initialization, ranks, communicators
  • Point-to-Point: Send, receive, synchronization
  • Collective Operations: Broadcast, scatter, gather, reduce
  • Performance: Timing, profiling, optimization
  • Advanced Patterns: Master-worker, SPMD, pipeline

Educational Use

Perfect for:

  • Computer Science courses on parallel programming
  • Research projects requiring MPI implementation
  • Performance analysis and optimization studies
  • HPC training and skill development

πŸ† Performance Achievements

  • 7.2x speedup with 8 processes (Monte Carlo)
  • 8.31 GFLOPS sustained (Matrix multiplication)
  • 90% parallel efficiency at scale
  • 950M samples/second throughput

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Open MPI Community for excellent MPI implementation
  • HPC Research Community for algorithmic insights
  • Contributors who help improve this project

📞 Contact & Support


⭐ Star this repo if you find it useful for learning MPI!

📚 Check out the research documentation for detailed analysis and findings.

🚀 Ready to learn parallel programming? Start with make demo!
