I am writing because I am facing an efficiency problem. I can run massively parallel calculations, with a maximum of 2000 processors and calculation times of 6 h/12 h. I therefore need to reach the highest possible efficiency for my calculations.
Here is an example of the automatic processor distribution obtained with paral_kgb -2000 (ngkpt: 2x2x2; nband: 1800), using version 7.4.2 of Abinit:
npimage| npkpt| npspinor| npfft| npband| bandpp | nproc| weight|
1 -> 1| 1 -> 4| 1 -> 1| 1 -> 68| 1 -> 1800| 1 -> 180| 2 -> 2000| 1 -> 2000|
1| 3| 1| 25| 25| 1| 1875| 1247.93 |
1| 3| 1| 25| 25| 2| 1875| 1194.18 |
1| 2| 1| 30| 30| 1| 1800| 1169.85 |
1| 2| 1| 32| 30| 1| 1920| 1160.33 |
1| 3| 1| 24| 24| 1| 1728| 1158.64 |
1| 3| 1| 24| 25| 1| 1800| 1154.12 |
1| 3| 1| 25| 24| 1| 1800| 1153.79 |
1| 3| 1| 27| 24| 1| 1944| 1148.80 |
1| 2| 1| 30| 30| 2| 1800| 1108.69 |
As you can see, the efficiency (weight) only reaches about 1248 and is always far from matching the number of processors.
The same calculation with version 7.0.3 of Abinit gives:
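To make the gap between weight and processor count concrete, here is a small Python sketch that divides each weight from the 7.4.2 table by its nproc. Reading this ratio as a rough parallel-efficiency estimate is my assumption; the output above does not say how the weight is defined. The function name `ratio` is only for illustration.

```python
# Rows copied from the Abinit 7.4.2 table above:
# (npkpt, npfft, npband, bandpp, nproc, weight)
rows = [
    (3, 25, 25, 1, 1875, 1247.93),
    (3, 25, 25, 2, 1875, 1194.18),
    (2, 30, 30, 1, 1800, 1169.85),
    (2, 32, 30, 1, 1920, 1160.33),
    (3, 24, 24, 1, 1728, 1158.64),
    (3, 24, 25, 1, 1800, 1154.12),
    (3, 25, 24, 1, 1800, 1153.79),
    (3, 27, 24, 1, 1944, 1148.80),
    (2, 30, 30, 2, 1800, 1108.69),
]


def ratio(nproc, weight):
    """Return weight / nproc (ASSUMPTION: read as an efficiency estimate)."""
    return weight / nproc


# Print one line per distribution, highest ratio first.
for npkpt, npfft, npband, bandpp, nproc, weight in sorted(
    rows, key=lambda r: ratio(r[4], r[5]), reverse=True
):
    print(f"nproc={nproc:5d} npkpt={npkpt} npfft={npfft:2d} npband={npband:2d} "
          f"bandpp={bandpp} weight/nproc={ratio(nproc, weight):.3f}")
```

With these numbers every ratio falls between roughly 0.59 and 0.67, and the distribution with the highest weight is not the one with the highest weight per processor.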
For dataset= 1 a possible choice for less than 2000 processors is:
nproc npkpt npspinor npband npfft bandpp weight
1872 4 1 36 13 2 0.50
1824 4 1 24 19 1 0.25
1800 4 1 225 2 4 28.00
1800 4 1 450 1 4 112.50
1800 4 1 25 18 4 0.25
1800 4 1 30 15 4 0.50
1800 4 1 45 10 4 1.00
1800 4 1 50 9 4 1.25
1800 4 1 75 6 4 3.00
1800 4 1 90 5 4 4.50
Here, as you can see, the efficiency is completely different, with a good distribution over 1800 processors (in the previous version of Abinit, the weight should match 1).
Maybe the parallel version of Abinit 7.4.2 improves the efficiency of the processor distribution, but the way processors are distributed is now difficult to understand.
In Abinit 7.0.3, the distribution follows these rules:
- distribution over k-points first
- then distribution over bands and the FFT grid, with the best efficiency when npband > 4 x npfft
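The two rules above can be sketched in Python as follows. This is only my reading of the 7.0.3 behaviour, not Abinit's actual algorithm: the function names are hypothetical, nkpt = 4 is assumed from the npkpt range "1 -> 4" printed by 7.4.2, npband is assumed to divide nband, and any further constraint Abinit applies (bandpp, FFT grid dimensions) is ignored.

```python
def divisors(n):
    """Return all positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def candidate_distributions(nband, nkpt, max_procs):
    """List (nproc, npkpt, npband, npfft) tuples following the two rules.

    Rule 1: distribute over all k-points first (npkpt = nkpt).
    Rule 2: split the rest over bands and FFT, keeping npband > 4 * npfft.
    ASSUMPTION: npband must divide nband; bandpp and FFT-grid limits ignored.
    """
    npkpt = nkpt
    found = []
    for npband in divisors(nband):
        # Largest npfft that keeps the total at or below max_procs.
        max_npfft = max_procs // (npkpt * npband)
        for npfft in range(1, max_npfft + 1):
            if npband > 4 * npfft:
                found.append((npkpt * npband * npfft, npkpt, npband, npfft))
    # Largest processor counts first.
    return sorted(found, reverse=True)


if __name__ == "__main__":
    for nproc, npkpt, npband, npfft in candidate_distributions(1800, 4, 2000)[:10]:
        print(f"nproc={nproc} npkpt={npkpt} npband={npband} npfft={npfft}")
```

Calling `candidate_distributions` with a different nband (for example a nearby value with more divisors) shows how the set of distributions allowed by these rules changes, which is the kind of adjustment I am trying to make by hand.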
As you can see, these rules no longer apply in Abinit 7.4.2. The problem is that I do not know whether an efficiency of 1248 for 1875 processors is acceptable. If it is not, since the processor distribution rules have changed, I do not know how to change nband (I think it is the only parameter I can adjust) in order to improve the efficiency.
Any advice is welcome.
Best regards,
Emile