[Q]parallel run with gamma point calculations

Posted: Wed Dec 28, 2011 9:07 pm
by bjeon
Hi, I am testing surface problems with a large vacuum region.
For a gamma-point-only run (kpt=0,0,0), I am trying to run in parallel with the following settings:

# for 2-CPU run
paral_kgb 1
npkpt 1
npband 2
npfft 1
bandpp 1

Because there is a single k-point, npkpt is set to 1, so I am trying to parallelize through npband or npfft instead. But this run is slower than the single-CPU run (77 -> 103 sec for 10 SCF loops). The test problem has 58 bands and a 40x40x160 FFT grid.
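As a quick sanity check on the settings above, here is a minimal Python sketch. It assumes two rules as I understand them from the ABINIT documentation for paral_kgb (please verify against your version): the number of MPI processes should equal npkpt x npband x npfft, and nband should be a multiple of npband x bandpp. The function name `check_kgb_layout` is made up for illustration and is not part of ABINIT.

```python
def check_kgb_layout(nproc, nband, npkpt=1, npband=1, npfft=1, bandpp=1):
    """Return a list of problems found in a paral_kgb process layout.

    Assumed rules (to be checked against the ABINIT docs):
      * nproc == npkpt * npband * npfft
      * nband is a multiple of npband * bandpp
    """
    problems = []
    layout = npkpt * npband * npfft
    if layout != nproc:
        problems.append(
            f"npkpt*npband*npfft = {layout}, but {nproc} processes were requested"
        )
    if nband % (npband * bandpp) != 0:
        problems.append(
            f"nband = {nband} is not a multiple of npband*bandpp = {npband * bandpp}"
        )
    return problems


if __name__ == "__main__":
    # Values from the 2-CPU test in this post: 58 bands, npband 2.
    issues = check_kgb_layout(nproc=2, nband=58, npkpt=1, npband=2, npfft=1, bandpp=1)
    print(issues if issues else "layout looks consistent")
```

An empty list only means the layout is self-consistent under these assumed rules; it says nothing about whether the run will actually scale.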

Are these options right? Or is the test problem too small to check parallel performance? I tried different npband/npfft values with different numbers of CPUs, but the results are still not good (they get worse with more CPUs). Any comments would be much appreciated.

ByoungSeon

Re: [Q]parallel run with gamma point calculations

Posted: Mon Jan 02, 2012 3:14 pm
by nleconte
When parallelising over bands, make sure the communication between CPUs is efficient, or have the job run on a single node so that it shares memory. For instance, I had very bad efficiency (almost a factor of ten) on a job requiring 8 CPUs when it was distributed over different nodes instead of a single node...

Re: [Q]parallel run with gamma point calculations

Posted: Sun Jan 22, 2012 4:10 pm
by mverstra
You are probably right that the problem is small. It is also hard to see a speed-up on 2 procs. Above all, the execution time is short, so a large fraction of it will be system time (start-up, file I/O, etc.), plus some pollution if there are other processes around.

Have you checked what Nicolas mentioned, that the processes truly are being distributed as intended by your MPI implementation and batch system?
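One rough way to see where the processes land is to launch a tiny script with the same mpirun and batch settings as the ABINIT job. The Python sketch below assumes Open MPI, which exports OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_SIZE to each launched process; other MPI implementations use different variable names. The file name where_am_i.py and the function `describe_process` are just placeholders.

```python
import os
import socket


def describe_process(environ=os.environ):
    """Report this process's MPI rank (if Open MPI set it) and its host name."""
    # These two variables are set by Open MPI's launcher; "?" means the
    # script was not started through Open MPI's mpirun.
    rank = environ.get("OMPI_COMM_WORLD_RANK", "?")
    size = environ.get("OMPI_COMM_WORLD_SIZE", "?")
    return f"rank {rank}/{size} on host {socket.gethostname()}"


if __name__ == "__main__":
    print(describe_process())
```

Running it as, for example, `mpirun -np 8 python where_am_i.py` should print one line per process; if the host names differ, the job is being spread over several nodes.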

Matthieu

Re: [Q]parallel run with gamma point calculations

Posted: Tue Jan 24, 2012 2:55 am
by bjeon
Hi.

Thanks for the comments and help.
Yes, I have been testing the jobs on SMP (multi-core) machines, not in a distributed environment.
But the efficiency is still not good. Do I need any other compile options? I used the following command for configuration:

./configure --enable-mpi --with-mpi-level=2 --prefix=/home/aaa/abinit FC=/opt/ompi143/bin/mpif90

Do I have to configure OpenMP or Pthreads?

As mentioned before, the parallel run works very well with multiple k-points. The problem is that parallel performance for a gamma-point
calculation (a single k-point) is not good.

Best regards,

ByoungSeon

Re: [Q]parallel run with gamma point calculations

Posted: Tue May 01, 2012 11:33 pm
by bjeon
Here are some observations regarding gamma-point parallelism.

1. A Xeon desktop (6 cores x hyperthreading = 12 threads) with the Intel Fortran compiler didn't yield (?) any speed-up with gamma-point parallelism.

2. On an AMD Opteron(tm) Processor 6176 machine with 48 cores, using open64 5.0, for a problem with ngfft size 45x45x216, the parallel options were:
paral_kgb 1
npkpt 1
npband 8 #---> varied as 1, 2, 4, 8
nband 120
npfft 1
bandpp 1

20 SCF loops were run, and the wall times are:
1 CPU = 908.4 sec
2 CPUs = 743.4 sec
4 CPUs = 385.6 sec
8 CPUs = 245.1 sec
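To put the wall times above in perspective, the short Python sketch below converts them into speed-up (T1 / Tn) and parallel efficiency (speed-up / n). The timings are the ones listed above; the function name `scaling_table` is only illustrative.

```python
# Wall times in seconds for 20 SCF loops, keyed by number of processes.
wall_times = {1: 908.4, 2: 743.4, 4: 385.6, 8: 245.1}


def scaling_table(times):
    """Return (nproc, speedup, efficiency) rows relative to the 1-process time."""
    t1 = times[1]
    rows = []
    for nproc in sorted(times):
        speedup = t1 / times[nproc]
        rows.append((nproc, speedup, speedup / nproc))
    return rows


if __name__ == "__main__":
    for nproc, speedup, efficiency in scaling_table(wall_times):
        print(f"{nproc:>2} procs: speed-up {speedup:.2f}, efficiency {efficiency:.0%}")
```

With these numbers the 8-process run comes out at a speed-up of roughly 3.7, i.e. a bit under 50% parallel efficiency.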

There might be some overhead, but gamma-point parallelism seems to work well here. The hardware/compiler combination may still need to be double-checked.

B.