running abinit on Blue Gene
Posted: Wed Mar 30, 2011 6:26 pm
by xiangpisai
Hi everyone,
I just compiled abinit on Blue Gene. But the minimum allocation allowed on our Blue Gene is 32 cpus, which is way too many for some small calculations. What's more, I found that I cannot run tgw1_3.in, located in the tutorial folder, on several cpus; it only runs on a single cpu. So is there any way to specify how many cpus abinit uses? I mean, if the system gives me, say, 512 cpus, can I reduce that, with some abinit command, to, say, 32 cpus? Thank you very much for your help!
Sincerely,
xiangpisai
Re: running abinit on Blue Gene
Posted: Wed Mar 30, 2011 6:41 pm
by Boris
Hi
I'm not sure I understand your question, but abinit uses the number of cpus that you are providing in the script file when you submit your job on BG.
For instance, when submitting your job on BG, if you specify 4 nodes, with 8 processors on each node, then abinit will use 32 procs.
Did I answer your question?
Boris
Re: running abinit on Blue Gene
Posted: Wed Mar 30, 2011 11:58 pm
by Alain_Jacques
Hello xiangpisai,
As Boris said, there are variables in Abinit that control how the different parallelization schemes work; look in
http://www.abinit.org/documentation/helpfiles/for-v6.6/input_variables/varpar.html for a list and follow the tutorial at
http://www.abinit.org/documentation/helpfiles/for-v6.6/tutorial/lesson_parallelism.html
Whatever scheme (k-point, band, ...) you choose, you have to launch Abinit with consistent parameters and, of course, this depends on the parallelization library/environment you linked Abinit with when compiling. With Open MPI, the launcher command will look similar to "mpirun -np X abinit ...", with MPICH2 "mpiexec -n X abinit ...", and with POE (a classic on Blue Gene) "poe abinit -procs X ...", where X is the number of parallel processes and ... stands for extra parameters relevant to your own cluster. I cannot guess what's running on your system or how you configured Abinit, but I'm pretty sure that the BG admins can provide extensive information on how to submit batch jobs on your Blue Gene system (and reserve a certain number of nodes) and how to call the parallel launcher within them.
The logic is:
1) set up the parallel variables in the Abinit input file;
2) define the corresponding X in the parallel launcher command (probably poe on BG); paral_kgb may help with choosing 1) and 2), as we discussed before;
3) write a batch submission file that reserves enough BG nodes and invokes the MPI launcher, then submit it to the queue (probably managed by LoadLeveler on BG).
Kind regards,
Alain
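The three launcher forms quoted above can be collected in a small shell helper. This is only a sketch: the function name launcher_cmd and the flavour labels (openmpi, mpich2, poe) are made up for this example, and the real command on a given cluster will need the extra site-specific parameters mentioned in the post.

```shell
#!/bin/sh
# Sketch only: print the launcher line for a given MPI flavour and process count.
# launcher_cmd and the flavour labels are hypothetical names chosen for this example.
launcher_cmd() {
    flavour=$1
    nprocs=$2
    case "$flavour" in
        openmpi) echo "mpirun -np $nprocs abinit" ;;
        mpich2)  echo "mpiexec -n $nprocs abinit" ;;
        poe)     echo "poe abinit -procs $nprocs" ;;
        *)       echo "unknown MPI flavour: $flavour" >&2; return 1 ;;
    esac
}

# Example: the line to use for 32 processes under Open MPI.
launcher_cmd openmpi 32
```

Whichever form applies, the process count passed to the launcher should agree with the parallel variables set in the Abinit input file.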
Re: running abinit on Blue Gene
Posted: Thu Mar 31, 2011 6:08 pm
by xiangpisai
Dear Boris,
I know there are 32 procs if I do what you said. But the problem is that the minimum allocation the system allows is 512 nodes, which is far too many for me. I want to know whether there is a way to get the 512 nodes assigned to me but have abinit use only 32 of them and leave the rest idle. Still, your answer is helpful. Thank you very much!
Best Regards
xiangpisai
Re: running abinit on Blue Gene
Posted: Thu Mar 31, 2011 6:10 pm
by xiangpisai
Dear Alain,
Thanks so much for your kind help. I failed to figure out which MPI the BG is using. We use slurm to queue the jobs, and the submission script reads: mpirun -mode VN -cwd `pwd` my/path/abinit-6.6.1/abinit < 1.files >& log. It seems to be Open MPI, so I tried your option -np 16. The job is still in the queue; I will post an update when I get a result. Thank you.
Re: running abinit on Blue Gene
Posted: Thu Mar 31, 2011 7:14 pm
by pouillon
To know which implementation of MPI you have, the following command might help:
Sometimes the option is "-showme" instead.
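As a rough sketch of this kind of check, the script below asks a compiler wrapper what it would run underneath. It assumes that MPICH-derived wrappers accept -show and Open MPI wrappers accept -showme; the helper name show_wrapper is made up for this example, and the wrapper names on a particular machine may differ.

```shell
#!/bin/sh
# Sketch: ask an MPI compiler wrapper which underlying compiler line it uses.
# Assumption: MPICH-derived wrappers accept -show, Open MPI wrappers accept -showme.
# show_wrapper is a hypothetical helper name.
show_wrapper() {
    wrapper=$1
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper: not found in PATH"
        return 1
    fi
    # Try each option in turn and stop at the first one the wrapper accepts.
    for opt in -show -showme; do
        if "$wrapper" "$opt" 2>/dev/null; then
            return 0
        fi
    done
    echo "$wrapper: neither -show nor -showme was accepted"
    return 1
}

# mpif90 is a common wrapper name; mpixlf90 is the IBM XL wrapper named in this thread.
for w in mpif90 mpixlf90; do
    show_wrapper "$w" || true
done
```

The paths and library names in the printed line usually hint at the MPI implementation behind the wrapper.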
Re: running abinit on Blue Gene
Posted: Thu Mar 31, 2011 7:37 pm
by xiangpisai
Thank you for your kind help. On the BG I use, there is no mpif90 command. I used mpixlf90 instead, and it printed a very long command line containing /opt/ibmcmp/xlf/bg/10.1/bin/blrts_xlf90 and a lot of .rts files. I cannot find any mention of openmpi or the like. But my submission script uses mpirun.
Re: running abinit on Blue Gene
Posted: Fri Apr 01, 2011 10:11 am
by Alain_Jacques
If I understand correctly, your jobs reserve 512 cores, use only a small fraction of them ... and are probably buried forever in the slurm queue. Both the MPI parameters and the slurm directives have to be adjusted accordingly. As discussed before, MPI launchers take -n xxx, -np xxx or -procs xxx depending on the variant; you have to figure out which one your cluster uses. You probably also have to add a "#SBATCH --ntasks=xxx" line to your slurm submission script, where xxx is the number of parallel tasks. This should equal the number of reserved cores, although the cluster setup can change that (BG cpus are somewhat exotic). You should probably also adjust the slurm "#SBATCH --mem-per-cpu=" and "#SBATCH --time=" directives.
Alain
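A minimal sketch of such a submission file, generated from a single variable so that the reserved task count and the MPI process count cannot drift apart. Everything specific here is an assumption for illustration: the value 32, the time limit, the memory figure, the file name abinit_job.sh, and the use of "mpirun -np"; the launcher flag and any Blue Gene specific options have to be taken from the site documentation.

```shell
#!/bin/sh
# Sketch: write a SLURM submission script in which --ntasks and the launcher's
# process count come from the same variable.
# NTASKS, the limits, and JOBFILE are assumed values for illustration only.
NTASKS=32
JOBFILE=abinit_job.sh

cat > "$JOBFILE" <<EOF
#!/bin/sh
#SBATCH --ntasks=$NTASKS
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=1000
# The launcher flag (-np, -n or -procs) depends on the MPI variant in use.
mpirun -np $NTASKS abinit < 1.files > log 2>&1
EOF

echo "wrote $JOBFILE; submit it with: sbatch $JOBFILE"
```

Changing NTASKS in one place then updates both the reservation and the launcher line.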