calculate_unbalanced_decomposition Subroutine

private subroutine calculate_unbalanced_decomposition(k, j, i, numsmall, numlarge, localprocs)

This subroutine (calculate_unbalanced_decomposition) is used in the code to create unbalanced xxf and yxf layouts to address 1 communication overheads in the redistribution of data for the FFTs used in the nonlinear calculations to move from Fourier space to real space and back again.

k is an integer input which is the factor that needs to be decomposed across the number of processes (the input integer localprocs) that we have at this point in the decomposition.

When this subroutine is called k is larger than localprocs (this is enforced through the if and else statements in the code that calls this routine). It is also only called when localprocs does not completely/equally divide k (so k/localprocs will not given a round number).

The subroutine returns four integers; i, j, numlarge, and numsmall

j is calculated as the integer division of k/localprocs. As this is an integer division the result is rounded down (remembering that k/localprocs will not be an equal division). j is used by the small block factor (the relative size of the small block).

i is the large block factor. As j is the rounded down division, i = j + 1. This gives two different block sizes (when these are later calculated).

numlarge is used to work out how many processes will use the large block size in the decomposition.

numsmall is used to work out how many processes will use the small block size in the decomposition.

AJ, November 2011: New code from DCSE project

Arguments

Type IntentOptional Attributes Name
integer, intent(in) :: k
integer, intent(out) :: j
integer, intent(out) :: i
integer, intent(out) :: numsmall
integer, intent(out) :: numlarge
integer, intent(in) :: localprocs

Contents


Source Code

  subroutine calculate_unbalanced_decomposition(k, j, i, numsmall, numlarge, localprocs)
    implicit none

    integer, intent(in) :: k, localprocs
    integer, intent(out) :: i, j, numlarge, numsmall

    ! To calculate the number of processes to give the large block size
    ! we work out the remainder left from dividing the input factor
    ! but the processes we have left.  This remainder is proportionate
    ! to the difference in the two block sizes.  For example, if k = 62
    ! and localprocs = 3 the code will work out i to be 21 and j to be
    ! 20, although k/localprocs is actually equal to 20 and 2/3.
    ! The code will also work out numlarge to be 2 and numsmall to
    ! be 1, so of the total processes in the simulatino 1/3 will have
    ! the small block and 2/3 the large block.  This maps to the 2/3
    ! in the actual division as opposed to the integer division (i.e
    ! the factor we have removed and replaced with small and large
    ! sizes, so replacing 20 2/3 by 1/3 x 20 and 2/3 * 21.
    numlarge = mod(k,localprocs)
    j  = k/localprocs
    i = j + 1  
    numsmall = localprocs - numlarge

  end subroutine calculate_unbalanced_decomposition