Best parallelism for 1st order response calculations.
Posted: Fri Sep 25, 2020 4:33 am
Hi everyone,
I am planning to run some parallel response-function calculations (for the phonon dispersion) on a large(ish) system of 80 atoms (352 bands), with a 5x3x2 k-point grid (30 k-points when kptopt = 3) and a 25 Ha cutoff, using PAW and the LDA.
The most time-consuming part of this calculation is the finite-q part, i.e.:
getwfk 1 # Use GS wave functions from dataset1
kptopt 3 # Need full k-point set for finite-Q response
rfphon 1 # Do phonon response
rfatpol 1 80 # Treat displacements of all atoms
rfdir 1 1 1 # Do all directions (symmetry will be used)
tolvrs 1.0d-12 # This default is active for sets 3-10
I want to work out how to distribute the load efficiently over k-points, the FFT grid and bands. The problem is that paral_kgb doesn't work here, and any value other than 1 for npfft or npband is reset to 1 when the calculation starts, i.e.:
--- !WARNING
src_file: m_mpi_setup.F90
src_line: 267
message: |
For non ground state calculation, set bandpp, npfft, npband, npspinor npkpt and nphf to 1
...
I have read that only k-point parallelisation works here. However, the ABINIT website (https://docs.abinit.org/topics/parallelism/) reports otherwise, saying:
For response calculations, the code has been parallelized (MPI-based parallelism) on k-points, spins, bands, as well as on perturbations. For the k-points, spins and bands parallelisation, the communication load is rather small also, and, unlike for the GS calculations, the number of nodes that can be used in parallel will be large, nearly independently of the physics of the problem. Parallelism on perturbations is very similar to the parallelism on images in the ground state case (so, very efficient), although the load balancing problem for perturbations with different number of k points is not adressed at present. Use of MPIIO is mandatory for the largest speed ups to be observed.
I then have three questions:
1) How does parallelism work in a phonon calculation?
2) How do I best set the number of processors for such a calculation (according to the number of bands, kpts and the fft grid?)
3) Does the hybrid MPI/OpenMP parallelisation help for RF calculations? I ask since the website doesn't mention this (from what I have found, at least).
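To make question 2 concrete, here is a minimal sketch of the load-balance arithmetic, under the assumption that the work is shared over k-points only and that every k-point costs the same. Whether ABINIT actually distributes RF work this way is exactly what I am unsure about, and the function names are hypothetical (they are not ABINIT tools).

```python
import math

# Numbers taken from the calculation described above.
nkpt = 30      # k-points with kptopt = 3
natom = 80     # atoms in the cell

# Upper bound on atomic-displacement perturbations (3 directions per atom),
# before any symmetry reduction.
n_pert = 3 * natom


def divisors(n):
    """Return all positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def kpt_efficiency(nkpt, nproc):
    """Fraction of processor time doing useful work if k-points are
    dealt out evenly and each k-point costs the same (an assumption)."""
    rounds = math.ceil(nkpt / nproc)
    return nkpt / (rounds * nproc)


# Processor counts that leave no core idle under this simple model.
balanced_counts = divisors(nkpt)

for nproc in (16, 20, 30):
    print(nproc, kpt_efficiency(nkpt, nproc))
```

Under this model, processor counts that divide 30 waste nothing, while a count such as 20 would leave a quarter of the processor time idle.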
Best,
Jack