Strange result of MPI_AllReduce for 16 byte real


Compiler: gfortran-4.8.5

MPI library: OpenMPI-1.7.2 (preinstalled OpenSuSE 13.2)

This program:

use mpi implicit none real*16 :: x integer :: ierr, irank, type16 call MPI_Init(ierr) call MPI_Comm_Rank(MPI_Comm_World, irank, ierr) if (irank+1==1) x = 2.1 if (irank+1==8) x = 2.8 if (irank+1==7) x = 5.2 if (irank+1==4) x = 6.7 if (irank+1==6) x = 6.5 if (irank+1==3) x = 5.7 if (irank+1==2) x = 4.0 if (irank+1==5) x = 6.8 print '(a,i0,a,f3.1)', "rank+1: ",irank+1," x: ",x call MPI_AllReduce(MPI_IN_PLACE, x, 1, MPI_REAL16, MPI_MAX, MPI_Comm_World, ierr) if (irank==0) print '(i0,a,f3.1)', irank+1," max x: ", x call MPI_Finalize(ierr) end

I also tried real(16), real(kind(1.q0)). real(real128) is actually equivalent with real*10 for this compiler.

The result is:

> mpif90 reduce16.f90 > mpirun -n 8 ./a.out rank+1: 1 x: 2.1 rank+1: 2 x: 4.0 rank+1: 3 x: 5.7 rank+1: 4 x: 6.7 rank+1: 5 x: 6.8 rank+1: 6 x: 6.5 rank+1: 7 x: 5.2 rank+1: 8 x: 2.8 1 max x: 2.8

The program finds the true maximum for real*10 keeping MPI_REAL16. The MPI specification (3.1, pages 628 and 674) is not very clear if MPI_REAL16 corresponds to real*16 or real(real128) if these differ.

Also, assuming MPI_REAL16 is actually real(real128) and trying to use that in a program leads to a different problem:

Error: There is no specific subroutine for the generic 'mpi_recv' at (1) Error: There is no specific subroutine for the generic 'mpi_send' at (1)

which does not happen for real*16. (disregarding that one should be able to pass any bit pattern, so this check is superfluous)

What is the right way to use 16 byte reals? Is the OpenMPI library in error?


While this should just work correctly in every MPI implementation, a straightforward workaround is to implement a user-defined reduction for this type that is written in Fortran, so there are no issues with implementing it in C (this is how MPICH and OpenMPI try to do everything, hence there are issues when C cannot reproduce the behavior of Fortran).

Below is an attempt to implement this. This is the user-defined reduction in Fortran. I am certain that experienced modern Fortran programmers can do it better.

subroutine sum_real16(iv,iov,n) implicit none integer, intent(in) :: n real*16, intent(in) :: iv(:) real*16, intent(inout) :: iov(:) integer :: i do i = 1,n iov(i) = iov(i) + iv(i) enddo end subroutine sum_real16 subroutine reduce_sum_real16(iv, iov, n, dt) use, intrinsic :: iso_c_binding, only : c_ptr use mpi_f08 implicit none type(c_ptr), value :: iv, iov integer :: n type(MPI_Datatype) :: dt if ( dt .eq. MPI_REAL16 ) then call sum_real16(iv,iov,n) endif end subroutine reduce_sum_real16 program test_reduce_sum_real16 use, intrinsic :: iso_c_binding use mpi_f08 implicit none integer, parameter :: n = 10 real*16 :: output(n) real*16 :: input(n) real*16 :: error integer :: me, np procedure(MPI_User_function) :: reduce_sum_real16 type(MPI_Op) :: mysum integer :: i call MPI_Init() call MPI_Comm_rank(MPI_COMM_WORLD,me) call MPI_Comm_size(MPI_COMM_WORLD,np) output = 0.0 input = 1.0*me call MPI_Op_create(reduce_sum_real16,.true.,mysum) call MPI_Allreduce(input,output,n,MPI_REAL16,mysum,MPI_COMM_WORLD) error = 0.0 do i = 1,n error = error + (output(i)-1.0*np) enddo if (error.gt.0.0) then print*,'SAD PANDA = ',error call MPI_Abort(MPI_COMM_SELF,1) endif call MPI_Op_free(mysum) call MPI_Finalize() end program test_reduce_sum_real16

This program returns without error with Intel 16 Fortran compiler and MPICH 3.2+. Apparently I am not using I/O correctly though, so my confidence in the correctness of this program is not as high as it would be if I could write all the results to stdout.


  • rgdal installation failed on ubuntu 14.04 [closed]
  • Is overriding an ActiveRecord relation's count() method okay?
  • golang: given a string, output an equivalent golang string literal
  • GLKViewControllerDelegate getting blocked
  • Real-time data transfer from Python to MATLAB
  • Check status of process and restart process if unresponsive/stopped - Python
  • How to change type datatable column [duplicate]
  • How create observable with parameters in Angular 6?
  • Re-order an array's items based on several criteria
  • How to set a dynamic form fields to dirty or touched with angular?
  • Working with django : Proxy setup
  • Getting CKEditor to work with Flask Admin
  • How to use SerialPort class more reliably
  • ASP.NET: replacing UpdatePanel with Jquery?
  • How do methods within a class access class constant variables?
  • Making query to find nearest multiple(Lat,Long) from the single(Lat,Long)
  • spring security manual login best practice
  • Implementation of timeout in LDAP
  • First dynamically-added TinyMCE editor displays, others do not
  • how to add dashed border on highcharts “area” graph for every point
  • VBScript InputBox and Help Files
  • How to use Flask's render_template from an ajax POST form submit
  • Generate and export point cloud from Project Tango
  • When to use the tag in the head and body section of a html page? [duplicate]
  • How to use Kaminari pagination gem with Sinatra and Mongoid?
  • Runtime complexity of getting the length of a string in different representations
  • JavaScript Regex to Match Boundaries of Words with diacritics
  • How to get rgb from transparent pixel in js
  • Computing the discrete fourier transform of audio data with FFTW
  • Using Service Component Runtime
  • How do I use TagLib-Sharp to write custom (PRIV) ID3 frames?
  • PHP Permalinks.. how to change?
  • media foundation H264 decoder not working properly
  • CAS 4 - Not able to retrieve the LDAP groups after successful authentication
  • Running R's aov() mixed effects model from Python using rpy2
  • Access to a Matlab gui from the web