running abinit on Blue Gene

xiangpisai
Posts: 19
Joined: Thu Mar 24, 2011 3:02 pm

running abinit on Blue Gene

Post by xiangpisai » Wed Mar 30, 2011 6:26 pm

Hi everyone,
I just compiled Abinit on Blue Gene. However, the minimum allocation allowed on Blue Gene is 32 CPUs, which is far too many for some small calculations. What's more, I found that I cannot run tgw1_3.in, located in the tutorial folder; it seems it can only run on a single CPU. So is there any way to specify how many CPUs Abinit uses? I mean, if the system gives me, say, 512 CPUs, can I reduce that to, say, 32 CPUs through an Abinit command? Thank you very much for your help!
Sincerely,
xiangpisai

Boris
Posts: 128
Joined: Tue Feb 16, 2010 10:13 am
Location: France

Re: running abinit on Blue Gene

Post by Boris » Wed Mar 30, 2011 6:41 pm

Hi

I'm not sure I understand your question, but Abinit uses the number of CPUs that you request in the script file when you submit your job on BG.

For instance, when submitting your job on BG, if you specify 4 nodes, with 8 processors on each node, then abinit will use 32 procs.

Did I answer your question?

Boris
----------------------------------------------------------
Boris Dorado
Atomic Energy Commission
France
----------------------------------------------------------

Alain_Jacques
Posts: 279
Joined: Sat Aug 15, 2009 9:34 pm
Location: Université catholique de Louvain - Belgium

Re: running abinit on Blue Gene

Post by Alain_Jacques » Wed Mar 30, 2011 11:58 pm

Hello xiangpisai,

As Boris said, there are variables in Abinit that control how the different parallelization schemes work; see http://www.abinit.org/documentation/helpfiles/for-v6.6/input_variables/varpar.html for a list, and follow the tutorial at http://www.abinit.org/documentation/helpfiles/for-v6.6/tutorial/lesson_parallelism.html
Whatever scheme (k-point, band, ...) you choose, you have to launch Abinit with consistent parameters and, of course, this depends on the parallelization library/environment you linked Abinit with when compiling. With Open MPI, the launcher command will look similar to "mpirun -np X abinit ..."; with MPICH2, "mpiexec -n X abinit ..."; with POE (a classic on Blue Gene), "poe abinit -procs X ...", where X is the number of parallel processes and "..." stands for extra parameters relevant to your own cluster. I cannot guess what is running on your system or how you configured Abinit, but I'm pretty sure that the BG admins can provide extensive information on how to submit batch jobs on your Blue Gene system (and reserve a certain number of nodes) and how to call the parallel launcher within them.
The logic is:
1) set up the parallel variables in the Abinit input file;
2) define the corresponding X in the parallel launcher command (probably poe on BG); steps 1) and 2) can perhaps be set with the help of paral_kgb, as we discussed before;
3) write a batch submission file that reserves enough BG nodes and invokes the MPI launcher, then submit it to the queue (probably managed by LoadLeveler on BG).
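
To make the launcher part concrete, here is a rough dry-run sketch that only prints the command line for each of the three variants quoted above. The helper name launcher_cmd and the flavour labels are made up for illustration, and the exact flags on a given Blue Gene installation may well differ, so treat it as a starting point and check with the local admins.

```shell
# Dry-run sketch: print (never execute) the launcher line for an MPI flavour.
# launcher_cmd and the labels openmpi/mpich2/poe are illustrative names only.
# Arguments: $1 = flavour, $2 = number of processes (X), $3 = executable.
launcher_cmd() {
    case "$1" in
        openmpi) echo "mpirun -np $2 $3" ;;
        mpich2)  echo "mpiexec -n $2 $3" ;;
        poe)     echo "poe $3 -procs $2" ;;
        *)       echo "unknown MPI flavour: $1" >&2; return 1 ;;
    esac
}

# Show the three candidate command lines for a 32-process run.
launcher_cmd openmpi 32 abinit
launcher_cmd mpich2 32 abinit
launcher_cmd poe 32 abinit
```

Whichever line matches your system, X (32 here) should agree with the parallel variables set in the Abinit input file.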

Kind regards,

Alain

xiangpisai
Posts: 19
Joined: Thu Mar 24, 2011 3:02 pm

Re: running abinit on Blue Gene

Post by xiangpisai » Thu Mar 31, 2011 6:08 pm

Dear Boris,
I know there are 32 procs if I do what you said. But the problem is that the minimum number of nodes the system allows is 512, which is far too many for me. I want to know whether there is a way to have 512 nodes allocated to me but have Abinit use only 32 of them, leaving the rest idle. Still, your answer is helpful. Thank you very much!
Best Regards
xiangpisai

xiangpisai
Posts: 19
Joined: Thu Mar 24, 2011 3:02 pm

Re: running abinit on Blue Gene

Post by xiangpisai » Thu Mar 31, 2011 6:10 pm

Dear Alain,
Thanks so much for your kind help. I failed to figure out which MPI the BG is using. We use slurm to queue the jobs, and the submission script reads: mpirun -mode VN -cwd `pwd` my/path/abinit-6.6.1/abinit < 1.files >& log. It seems to be Open MPI, so I tried your option -np 16. The job is still in the queue; I will post an update when I get a result. Thank you.

pouillon
Posts: 651
Joined: Wed Aug 19, 2009 10:08 am
Location: Spain

Re: running abinit on Blue Gene

Post by pouillon » Thu Mar 31, 2011 7:14 pm

To find out which implementation of MPI you have, the following command might help:

mpif90 -show

Sometimes the option is "-showme" instead.
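
If you are not sure which wrapper or which option applies, a small loop can try both. This is only a sketch: the helper name probe_wrapper is made up, and mpifort and mpicc are guesses at wrapper names that may or may not be installed on your machine.

```shell
# Sketch: report what an MPI compiler wrapper expands to, if it exists.
# probe_wrapper is a hypothetical helper; $1 is the wrapper name to try.
probe_wrapper() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$1: not found in PATH"
        return 0
    fi
    echo "== $1 =="
    # Try "-show" first, then fall back to "-showme".
    "$1" -show 2>/dev/null || "$1" -showme 2>/dev/null \
        || echo "$1: neither -show nor -showme worked"
}

# Wrapper names other than mpif90 are assumptions; adapt to your system.
for w in mpif90 mpifort mpicc; do
    probe_wrapper "$w"
done
```

The paths and library names in the printed compile line usually give away the MPI implementation.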
Yann Pouillon
Simune Atomistics
Donostia-San Sebastián, Spain

xiangpisai
Posts: 19
Joined: Thu Mar 24, 2011 3:02 pm

Re: running abinit on Blue Gene

Post by xiangpisai » Thu Mar 31, 2011 7:37 pm

Thank you for your kind help. On the BG I use, there is no mpif90 command. I used mpixlf90 instead, and it printed a very long command line that contains /opt/ibmcmp/xlf/bg/10.1/bin/blrts_xlf90 together with a lot of .rts files. I cannot find anything like Open MPI in it. However, my submission script uses mpirun.

Alain_Jacques
Posts: 279
Joined: Sat Aug 15, 2009 9:34 pm
Location: Université catholique de Louvain - Belgium

Re: running abinit on Blue Gene

Post by Alain_Jacques » Fri Apr 01, 2011 10:11 am

If I understand correctly, your jobs reserve 512 cores, use only a small fraction of them ... and are probably buried forever in the slurm queue. Both the MPI parameters and the slurm directives have to be adjusted accordingly. As discussed before, the MPI launcher takes -n xxx, -np xxx or -procs xxx depending on the variant; you have to figure out which one your cluster uses. You probably also have to add a "#SBATCH --ntasks=xxx" line to your slurm submission script, where xxx is the number of parallel tasks and also the number of reserved cores. The two should be the same, although the cluster setup can change this (BG CPUs are somewhat exotic). You should probably also adjust the "#SBATCH --mem-per-cpu=" and "#SBATCH --time=" directives.
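
As a rough illustration, the snippet below prints a skeleton submission script in which the task count given to slurm and the process count given to the launcher come from the same number. Everything specific in it is a placeholder: the helper name make_job, the one-hour limit, the 1000 MB per CPU, the "mpirun -np" launcher syntax and the 1.files/log names all have to be adapted to what your cluster actually expects.

```shell
# Sketch: print a skeleton slurm job script for a given number of tasks.
# make_job is a hypothetical helper; $1 is the number of parallel tasks.
make_job() {
    cat <<EOF
#!/bin/sh
#SBATCH --ntasks=$1
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=1000
# The launcher gets the same process count as --ntasks above.
# "mpirun -np" is an assumption; use the launcher syntax of your system.
mpirun -np $1 abinit < 1.files > log 2>&1
EOF
}

# Print the script for a 32-task run; nothing is submitted here.
make_job 32
```

You could redirect the output to a file (for example make_job 32 > abinit_job.sh) and submit that file with sbatch once the values look right.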

Alain
