Tuesday, August 3, 2010

MPI_ERRORS_ARE_FATAL for ARPACK running OpenMPI on CentOS 4

One of our researchers was using ARPACK with OpenMPI and ran into the following error.

mpirun -np 1 pssdrv1_intel_em64t
[xxx.edu.sg:19479] *** An error occurred in MPI_Allreduce
[xxx.edu.sg:19479] *** on communicator MPI_COMM_WORLD
[xxx.edu.sg:19479] *** MPI_ERR_OP: invalid reduce operation
[xxx.edu.sg:19479] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 19479 on node xxx.edu.sg exiting without calling "finalize". This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

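He managed to solve the problem by removing the mpif.h file that comes with the source code. Assuming OpenMPI has already been built with ifort, the package should then pick up OpenMPI's own mpif.h and work. The likely reason is that PARPACK is a relatively established package for parallel iterative matrix diagonalization and was developed against an early MPI standard, so the header it ships presumably does not match what OpenMPI expects (hence the "invalid reduce operation" in MPI_Allreduce).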
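
If you want to check whether your own source tree carries a stray copy of the header before rebuilding, a small script along these lines can list the candidates. This is only a sketch: the function name find_bundled_mpif is made up for illustration, and it assumes the offending file is literally named mpif.h somewhere under the directory you point it at. It only lists files; deciding which ones to remove is still up to you.

```python
import os
import sys


def find_bundled_mpif(root):
    """Return the sorted paths of every file named mpif.h under root.

    Hypothetical helper: it only reports files, it never deletes anything.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare case-insensitively in case the header is stored as MPIF.H
            if name.lower() == "mpif.h":
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # Usage (the path is an assumption): python find_mpif.py /path/to/ARPACK
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_bundled_mpif(root):
        print(path)
```

Any path it prints that sits inside the package source, rather than inside the OpenMPI installation, is a candidate for the header that was removed in the fix above.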

I have not verified the solution, but we have a happy scientist...
