processor distribution rules

Post by Emile » Wed Nov 13, 2013 3:00 pm

Dear users,

I am writing because I am facing an efficiency problem. I can run massively parallel calculations, with a maximum of 2000 processors and calculation times of 6 h/12 h. I therefore need to reach the highest possible efficiency for my calculations.

Here is an example of the automatic processor distribution obtained with paral_kgb -2000 (ngkpt: 2x2x2; nband: 1800), using version 7.4.2 of Abinit:

     npimage|       npkpt|    npspinor|       npfft|      npband|     bandpp |       nproc|      weight|
   1 ->    1|   1 ->    4|   1 ->    1|   1 ->   68|   1 -> 1800|   1 ->  180|   2 -> 2000|   1 -> 2000|
           1|           3|           1|          25|          25|           1|        1875|    1247.93 |
           1|           3|           1|          25|          25|           2|        1875|    1194.18 |
           1|           2|           1|          30|          30|           1|        1800|    1169.85 |
           1|           2|           1|          32|          30|           1|        1920|    1160.33 |
           1|           3|           1|          24|          24|           1|        1728|    1158.64 |
           1|           3|           1|          24|          25|           1|        1800|    1154.12 |
           1|           3|           1|          25|          24|           1|        1800|    1153.79 |
           1|           3|           1|          27|          24|           1|        1944|    1148.80 |
           1|           2|           1|          30|          30|           2|        1800|    1108.69 |


As you can see, the efficiency (weight) reaches 1248 at best and is always far from matching the number of processors.
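To put a number on that gap, here is a minimal sketch that divides each weight from the 7.4.2 listing by the corresponding nproc. Reading weight/nproc as a parallel efficiency is my interpretation, not something stated in the Abinit output, and the helper name `weight_per_proc` is made up for this post.

```python
# Rows copied from the abinit-7.4.2 listing above:
# (npkpt, npfft, npband, bandpp, nproc, weight)
rows_742 = [
    (3, 25, 25, 1, 1875, 1247.93),
    (3, 25, 25, 2, 1875, 1194.18),
    (2, 30, 30, 1, 1800, 1169.85),
    (2, 32, 30, 1, 1920, 1160.33),
    (3, 24, 24, 1, 1728, 1158.64),
    (3, 24, 25, 1, 1800, 1154.12),
    (3, 25, 24, 1, 1800, 1153.79),
    (3, 27, 24, 1, 1944, 1148.80),
    (2, 30, 30, 2, 1800, 1108.69),
]


def weight_per_proc(weight, nproc):
    """Ratio weight / nproc, read here as an efficiency (an assumption)."""
    return weight / nproc


for npkpt, npfft, npband, bandpp, nproc, weight in rows_742:
    # npimage and npspinor are 1 in every row, so nproc = npkpt * npfft * npband.
    assert npkpt * npfft * npband == nproc
    ratio = weight_per_proc(weight, nproc)
    print(f"nproc={nproc:5d}  weight={weight:8.2f}  weight/nproc={ratio:.3f}")
```

For the rows listed, the ratio stays between roughly 0.59 and 0.67, so under this reading about one third of the processors would not be contributing.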

The same calculation with version 7.0.3 of Abinit gives:

 For dataset=   1  a possible choice for less than 2000 processors is:
  nproc     npkpt  npspinor    npband     npfft    bandpp    weight
  1872       4         1        36        13         2        0.50
  1824       4         1        24        19         1        0.25
  1800       4         1       225         2         4       28.00
  1800       4         1       450         1         4      112.50
  1800       4         1        25        18         4        0.25
  1800       4         1        30        15         4        0.50
  1800       4         1        45        10         4        1.00
  1800       4         1        50         9         4        1.25
  1800       4         1        75         6         4        3.00
  1800       4         1        90         5         4        4.50


Here, as you can see, the efficiency is completely different, with a good distribution over 1800 processors (in the previous version of Abinit, the weight of a good distribution should be close to 1).
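As a side note on what the 7.0.3 weight seems to measure: for the ten rows listed, the value equals the integer quotient of npband by npfft, divided by 4. This is only a pattern I fitted to this listing (the function `guessed_weight` is hypothetical), not a formula taken from the Abinit sources or documentation.

```python
# Rows copied from the abinit-7.0.3 listing above:
# (nproc, npkpt, npspinor, npband, npfft, bandpp, weight)
rows_703 = [
    (1872, 4, 1, 36, 13, 2, 0.50),
    (1824, 4, 1, 24, 19, 1, 0.25),
    (1800, 4, 1, 225, 2, 4, 28.00),
    (1800, 4, 1, 450, 1, 4, 112.50),
    (1800, 4, 1, 25, 18, 4, 0.25),
    (1800, 4, 1, 30, 15, 4, 0.50),
    (1800, 4, 1, 45, 10, 4, 1.00),
    (1800, 4, 1, 50, 9, 4, 1.25),
    (1800, 4, 1, 75, 6, 4, 3.00),
    (1800, 4, 1, 90, 5, 4, 4.50),
]


def guessed_weight(npband, npfft):
    """Pattern fitted to the rows above (an assumption): floor(npband / npfft) / 4."""
    return (npband // npfft) / 4


for nproc, npkpt, npspinor, npband, npfft, bandpp, weight in rows_703:
    # Every row uses nproc = npkpt * npspinor * npband * npfft.
    assert nproc == npkpt * npspinor * npband * npfft
    print(f"npband={npband:4d} npfft={npfft:3d} "
          f"listed={weight:7.2f} guessed={guessed_weight(npband, npfft):7.2f}")
```

If the pattern holds, a weight of 1 corresponds to npband being about 4 times npfft, which is why the 7.0.3 and 7.4.2 weights cannot be compared directly.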

Maybe the parallel version of abinit-7.4.2 improves the efficiency of processor distribution, but the way processors are distributed is now difficult to understand.

In abinit-7.0.3, the distribution follows these rules:
- distribution over k-points first
- then distribution over bands and the FFT grid, with the best efficiency when npband > 4 x npfft
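The two rules above can be sketched as a small enumeration. This is only how I read them, not Abinit's actual algorithm: the function `candidate_distributions` is hypothetical, I assume nkpt = 4 (the 7.4.2 listing shows npkpt ranging from 1 to 4), and I assume npband must divide nband.

```python
def candidate_distributions(nkpt, nband, max_procs, ratio=4):
    """List (nproc, npkpt, npband, npfft) tuples obeying the two rules above.

    Hypothetical helper for illustration; it does not reproduce Abinit's
    internal processor distribution.
    """
    npkpt = nkpt                      # rule 1: distribute over k-points first
    per_kpt = max_procs // npkpt      # processors left for bands x FFT
    found = []
    for npband in range(1, nband + 1):
        if nband % npband != 0:       # assumption: npband divides nband
            continue
        for npfft in range(1, per_kpt // npband + 1):
            if npband > ratio * npfft:  # rule 2: npband > 4 x npfft
                found.append((npkpt * npband * npfft, npkpt, npband, npfft))
    return sorted(found, reverse=True)  # largest processor counts first


if __name__ == "__main__":
    # Values from this post: 4 k-points (assumed), nband = 1800, at most 2000 procs.
    for nproc, npkpt, npband, npfft in candidate_distributions(4, 1800, 2000)[:10]:
        print(f"nproc={nproc:5d} npkpt={npkpt} npband={npband:4d} npfft={npfft:3d}")
```

With these inputs the enumeration contains several of the 1800-processor rows from the 7.0.3 listing (for example npband = 90, npfft = 5) and excludes those with npband <= 4 x npfft, such as npband = 30, npfft = 15.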

As you can see, these rules no longer apply in abinit-7.4.2. The problem is that I do not know whether a weight of 1248 for 1875 processors is acceptable. If it is not, since the processor distribution rules have changed, I do not know how to change nband (I think it is the only parameter I can play with) in order to improve the efficiency.

Any advice is welcome.
Best regards,

Emile
