I. Experimental environment construction
The MPI cluster environment is configured in Visual Studio 2022 after installing the MPI SDK and runtime (for example, MS-MPI) on the computer. The project's property pages are then set up as follows:
- VC++ Directories -> Include Directories: add MPI's Include directory.
- VC++ Directories -> Library Directories: add MPI's x64 library directory.
- C/C++ -> Preprocessor -> Preprocessor Definitions: add "MPICH_SKIP_MPICXX" and click OK.
- C/C++ -> Code Generation -> Runtime Library: select "Multi-threaded Debug (/MTd)".
- Linker -> Input -> Additional Dependencies: add the names of the three .lib files in MPI's x64 library directory.
- Once the above configuration is complete, simply click Build Solution.
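After a successful build, the program is typically launched with the mpiexec launcher that ships with MS-MPI; the executable name below is only a placeholder for the project's output, and 4 is the number of processes to start:

mpiexec -n 4 MPI_Sum.exe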
II. MPI Program Code
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
// Calculate the sum from start to end.
long long sum(int start, int end) {
long long result = 0;
for (int i = start; i <= end; ++i) {
result += i;
}
return result;
}
int main(int argc, char* argv[]) {
int rank, size;
long long total_sum = 0;
int start, end;
double start_time, end_time;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank); // returns the rank of the calling process within the communicator (typically MPI_COMM_WORLD)
MPI_Comm_size(MPI_COMM_WORLD, &size); // returns the total number of processes in the communicator
// Calculate the working range of each process
int total_numbers = 10000;
int chunk_size = total_numbers / size; // numbers handled by each process
int remainder = total_numbers % size;  // numbers left over after the even split
if (rank < remainder) {
// The first 'remainder' processes handle one extra number
start = rank * (chunk_size + 1) + 1;
end = start + chunk_size;
}
else {
// The remaining processes
start = rank * chunk_size + remainder + 1;
end = start + chunk_size - 1;
}
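// Worked example (hypothetical input, not the value used below): with
// total_numbers = 10003 and size = 4, chunk_size = 2500 and remainder = 3,
// so ranks 0-2 each sum 2501 numbers (1-2501, 2502-5002, 5003-7503) and
// rank 3 sums the remaining 2500 numbers (7504-10003).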
// Start time
start_time = MPI_Wtime();
// Calculate the partial sum
long long partial_sum = sum(start, end);
// end time
end_time = MPI_Wtime();
// Barrier synchronization: every process waits here until all processes in the communicator have reached this point
MPI_Barrier(MPI_COMM_WORLD);
// Output the computation time and range for each process
printf("Process %d computed sum from %d to %d in %.6f seconds\n", rank, start, end, end_time - start_time);
// MPI_Reduce() combines the same variable from every process in the communicator using a reduction operation and delivers the result to the specified root process
// Aggregate the partial sums into total_sum on process 0
MPI_Reduce(&partial_sum, &total_sum, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
// Process 0 outputs the sum
if (rank == 0) {
printf("Total sum from 1 to %d is %lld\n", total_numbers, total_sum);
printf("Number of processes used: %d\n", size);
}
MPI_Finalize();
return 0;
}
Collective communication function: the MPI_Reduce() function aggregates the local sum of each process onto process rank 0.
Synchronization function: the MPI_Barrier() function blocks every process until all processes in the communicator have reached the barrier. It can be used to ensure that all processes have completed a given phase of work before execution continues.
Timing function: the MPI_Wtime() function returns a high-resolution wall-clock timestamp, so each process can measure the elapsed time of its own work as the difference between two calls.
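As an illustrative sketch only (it reuses rank, start, end, and sum() from the program above and would sit inside main()), the three functions can also be combined to time the whole parallel phase: synchronize before starting the clock, then reduce each process's elapsed time with MPI_MAX so that process 0 reports the slowest process.

MPI_Barrier(MPI_COMM_WORLD);            // all processes start timing together
double t0 = MPI_Wtime();
long long local_sum = sum(start, end);  // the same per-process work as above
double elapsed = MPI_Wtime() - t0;
double slowest = 0.0;
// The slowest process determines the real parallel runtime, hence MPI_MAX
MPI_Reduce(&elapsed, &slowest, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
if (rank == 0) {
printf("Slowest process needed %.6f seconds (rank 0 local sum %lld)\n", slowest, local_sum);
}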
III. Screenshots and analysis of operation results
1. The experiment output shows that 1 + 2 + ... + 10000 = 50005000, which matches the closed form n(n+1)/2 = 10000 x 10001 / 2 = 50005000, so the computation is correct.
2. The MPI_Barrier() function ensures that no process proceeds past the barrier until every process has reached it, guaranteeing that the communicating processes stay synchronized.
By distributing the summation across processes and exchanging only small messages (each process contributes a single partial sum to process 0), the program communicates efficiently between processes and improves the efficiency of the parallel computation.