ACML or ACML_MP ?
Moderators: fgoudreault, mcote
Forum rules
Please have a look at ~abinit/doc/config/build-config.ac in the source package for detailed and up-to-date information about the configuration of Abinit 8 builds.
For a video explanation on how to build Abinit 7.x for Linux, please go to: http://www.youtube.com/watch?v=DppLQ-KQA68.
IMPORTANT: when an answer solves your problem, please check the little green V-like button on its upper-right corner to accept it.
ACML or ACML_MP ?
Is there any difference between using ACML or ACML MP?
The reason I am asking: if I start 4 MPI processes on a machine with 4 cores and use ACML MP, will each MPI process spawn 4 threads (up to 16 in total)?
Thanks,
Evren
- Alain_Jacques
- Posts: 279
- Joined: Sat Aug 15, 2009 9:34 pm
- Location: Université catholique de Louvain - Belgium
Re: ACML or ACML_MP ?
Dear Evren,
You may see performance gains with the ACML_MP library thanks to the multithreading of some of its functions, on large linear algebra systems and assuming you have cores available. Performance comparisons are somewhat complicated by the fact that recent multicore processors automatically tune the core frequencies depending on the number of cores that are busy.
Anyway, Abinit and ACML_MP use different parallelization techniques: Abinit uses a mixture of MPI and OpenMP, while ACML_MP is OpenMP only, so it is possible to overload the cores. It is up to you, and your specific study (many k points? many bands? ...), to find the right balance between the two schemes. Furthermore, the number of threads opened by OpenMP routines can be adjusted with the OMP_NUM_THREADS environment variable, so you can compile a parallel Abinit and decide at runtime.
Kind regards,
Alain
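To make the balance discussed above concrete, here is a minimal sketch of the arithmetic, assuming every MPI process may open OMP_NUM_THREADS threads inside a multithreaded library call. The function names are hypothetical and are not part of Abinit or ACML.

```python
def total_threads(ranks_per_node: int, omp_num_threads: int) -> int:
    """Worst-case number of busy threads on one node."""
    return ranks_per_node * omp_num_threads


def threads_per_rank(cores_per_node: int, ranks_per_node: int) -> int:
    """Largest thread count per MPI process that keeps
    ranks * threads <= cores, never going below 1."""
    if cores_per_node < 1 or ranks_per_node < 1:
        raise ValueError("core and rank counts must be positive")
    return max(1, cores_per_node // ranks_per_node)


if __name__ == "__main__":
    # The case from the question: 4 MPI processes on a 4-core machine,
    # each one opening 4 threads, gives 4 * 4 threads on 4 cores.
    print(total_threads(4, 4))
    # Thread count per process that avoids overloading the same machine.
    print(threads_per_rank(4, 4))
    # A single process on the node can use all 4 cores.
    print(threads_per_rank(4, 1))
```

The value returned by `threads_per_rank` is only an upper bound that avoids overloading the cores; whether threads or MPI processes pay off more still depends on the study, as noted above.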
Re: ACML or ACML_MP ?
I understand that. I am building packages for a cluster, so I wanted to make sure that the settings were not badly chosen. It is difficult to teach people how to run their programs, and it is difficult to make them set environment variables.
As far as I understand, it would be best to set OMP_NUM_THREADS to 1 when running MPI tasks on a cluster, as long as each core on a node gets an MPI task. However, if MPI runs only 1 process per node, then it is better to unset OMP_NUM_THREADS. Am I understanding correctly?
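One possible way to spare users from setting the variable themselves is a small launcher that applies a default of 1 thread only when OMP_NUM_THREADS is not already set. This is a rough sketch under that assumption; `child_env` and `run` are hypothetical helper names, not part of any Abinit or ACML tooling.

```python
import os
import subprocess
import sys


def child_env(base_env=None, default_threads=1):
    """Return a copy of the environment in which OMP_NUM_THREADS is set
    to `default_threads` only if the user has not set it already."""
    env = dict(os.environ if base_env is None else base_env)
    env.setdefault("OMP_NUM_THREADS", str(default_threads))
    return env


def run(command, base_env=None):
    """Run `command` (a list of arguments) with the defaulted environment
    and return its exit code."""
    if not command:
        raise ValueError("no command given")
    return subprocess.run(command, env=child_env(base_env), check=False).returncode
```

A packaged wrapper script could call `run` with the real launch command as a list of arguments. A user who wants the 1-process-per-node case can still export a larger OMP_NUM_THREADS, and the wrapper will leave it alone.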
Re: ACML or ACML_MP ?
It appears that when ATLAS is configured to run with a specific number of threads, it ignores the OMP_NUM_THREADS variable.
- Alain_Jacques
- Posts: 279
- Joined: Sat Aug 15, 2009 9:34 pm
- Location: Université catholique de Louvain - Belgium
Re: ACML or ACML_MP ?
Right. Most of the time I build an MPI Abinit with sequential BLAS/LAPACK, and a sequential Abinit with multithreaded BLAS/LAPACK libraries. I'm lucky enough to have small unit cells and many k points, so my studies run efficiently with MPI parallelism. If the case is sequential, I give it a few cores to please BLAS/LAPACK ... but don't expect a linear performance gain.
I also suggest splitting the different parts of an input file: most of the time, their parallelization requirements are very different, so there is no need to waste CPUs.
IMHO ACML performance is so-so ... if you have some time to spare on benchmarks, I suggest trying OpenBLAS or even MKL on AMD CPUs.
Kind regards,
Alain
... and yes, ATLAS hardcodes the number of threads at compile time ... most Linux packages default to 2 threads - pretty arbitrary
Re: ACML or ACML_MP ?
Actually, I sort of forgot that the thread I made was about ACML. While ATLAS does not obey the OpenMP environment variables, ACML is actually able to run with 1 thread. The question now is whether there is a performance penalty for running threaded ACML with 1 thread, compared to serial ACML. I will let you know after running some tests; it is in my task queue.