MPI Berry Phase Tutorial Crash

Post by paulfons » Thu Aug 30, 2012 7:49 am

I first ran the Berry phase tutorial file tffield_1.in through my version of abinit (6.12.3) without incident. When I added the following three lines to parallelize over the k-points using MPI, the program crashed in a manner similar to my earlier posting. The parallelization settings were chosen by first running with paral_kgb -32 and then running on 32 processors. I do not see what I am doing wrong. Is there a crashing bug in the Berry phase routine? Does anyone have any insight into what I could do to get it to work?


#Parallelization
#***********************
paral_kgb 1
npkpt 16
npspinor 2


The code crashed in a manner similar to my earlier posting:




Simple Lattice Grid
initberry: for direction 1, nkstr = 6, nstr = 144
initberry: for direction 2, nkstr = 6, nstr = 144
initberry: for direction 3, nkstr = 6, nstr = 144
*** glibc detected *** /opt/abinit/bin/abinit: free(): invalid pointer: 0x0000000005487240 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x733b6)[0x2b4825c013b6]
/lib64/libc.so.6(cfree+0x6c)[0x2b4825c062dc]
/opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libifcore.so.5(for_deallocate+0xb9)[0x2b4824463879]
/opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libifcore.so.5(for_dealloc_allocatable+0x70)[0x2b4824463780]
/opt/abinit/bin/abinit[0x1403220]
/opt/abinit/bin/abinit[0xd01f30]
/opt/abinit/bin/abinit[0x4e7c5e]
/opt/abinit/bin/abinit[0x41ecee]
/opt/abinit/bin/abinit[0x4101f8]
/opt/abinit/bin/abinit[0x40853a]
/opt/abinit/bin/abinit[0x4070dc]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2b4825bacbfd]
/opt/abinit/bin/abinit[0x406fd9]
======= Memory map: ========
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
abinit 000000000140301E Unknown Unknown Unknown
abinit 0000000000D01F30 Unknown Unknown Unknown
abinit 00000000004E7C5E Unknown Unknown Unknown
abinit 000000000041ECEE Unknown Unknown Unknown
abinit 00000000004101F8 Unknown Unknown Unknown
abinit 000000000040853A Unknown Unknown Unknown
abinit 00000000004070DC Unknown Unknown Unknown
libc.so.6 00002ACF0A881BFD Unknown Unknown Unknown
abinit 0000000000406FD9 Unknown Unknown Unknown
00400000-022a5000 r-xp 00000000 08:01 1580141 /opt/abinit/bin/abinit
024a4000-024a5000 r--p 01ea4000 08:01 1580141 /opt/abinit/bin/abinit
024a5000-02620000 rw-p 01ea5000 08:01 1580141 /opt/abinit/bin/abinit
02620000-05602000 rw-p 00000000 00:00 0 [heap]
2b4821533000-2b4821551000 r-xp 00000000 08:01 653270 /lib64/ld-2.11.3.so
2b4821551000-2b4821552000 rw-p 00000000 00:00 0
2b4821568000-2b4821569000 rw-p 00000000 00:00 0
2b4821750000-2b4821751000 r--p 0001d000 08:01 653270 /lib64/ld-2.11.3.so
2b4821751000-2b4821752000 rw-p 0001e000 08:01 653270 /lib64/ld-2.11.3.so
2b4821752000-2b4821753000 rw-p 00000000 00:00 0
2b4821753000-2b4821755000 r-xp 00000000 08:01 659377 /lib64/libdl-2.11.3.so
2b4821755000-2b4821955000 ---p 00002000 08:01 659377 /lib64/libdl-2.11.3.so
2b4821955000-2b4821956000 r--p 00002000 08:01 659377 /lib64/libdl-2.11.3.so
2b4821956000-2b4821957000 rw-p 00003000 08:01 659377 /lib64/libdl-2.11.3.so
2b4821957000-2b4821f29000 r-xp 00000000 08:01 1594785 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_intel_lp64.so
2b4821f29000-2b4822129000 ---p 005d2000 08:01 1594785 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_intel_lp64.so
2b4822129000-2b4822138000 rw-p 005d2000 08:01 1594785 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_intel_lp64.so
2b4822138000-2b482213d000 rw-p 00000000 00:00 0
2b482213d000-2b4822610000 r-xp 00000000 08:01 1594795 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_sequential.so
2b4822610000-2b482280f000 ---p 004d3000 08:01 1594795 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_sequential.so
2b482280f000-2b482281b000 rw-p 004d2000 08:01 1594795 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_sequential.so
2b482281b000-2b482281c000 rw-p 00000000 00:00 0
2b482281c000-2b4823664000 r-xp 00000000 08:01 1594779 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_core.so
2b4823664000-2b4823863000 ---p 00e48000 08:01 1594779 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_core.so
2b4823863000-2b4823877000 rw-p 00e47000 08:01 1594779 /opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/libmkl_core.so
2b4823877000-2b4823887000 rw-p 00000000 00:00 0
2b4823887000-2b482388f000 r-xp 00000000 08:01 659403 /lib64/librt-2.11.3.so
2b482388f000-2b4823a8e000 ---p 00008000 08:01 659403 /lib64/librt-2.11.3.so
2b4823a8e000-2b4823a8f000 r--p 00007000 08:01 659403 /lib64/librt-2.11.3.so
2b4823a8f000-2b4823a90000 rw-p 00008000 08:01 659403 /lib64/librt-2.11.3.so
2b4823a90000-2b4823df4000 r-xp 00000000 08:01 1579710 /opt/intel/impi/4.0.3.008/intel64/lib/libmpi.so.4.0
2b4823df4000-2b4823ef3000 ---p 00364000 08:01 1579710 /opt/intel/impi/4.0.3.008/intel64/lib/libmpi.so.4.0
2b4823ef3000-2b4823f10000 rw-p 00363000 08:01 1579710 /opt/intel/impi/4.0.3.008/intel64/lib/libmpi.so.4.0
2b4823f10000-2b4823f5b000 rw-p 00000000 00:00 0
2b4823f5b000-2b4823f88000 r-xp 00000000 08:01 1579745 /opt/intel/impi/4.0.3.008/intel64/lib/libmpigf.so.4.0
2b4823f88000-2b4824088000 ---p 0002d000 08:01 1579745 /opt/intel/impi/4.0.3.008/intel64/lib/libmpigf.so.4.0
2b4824088000-2b4824089000 rw-p 0002d000 08:01 1579745 /opt/intel/impi/4.0.3.008/intel64/lib/libmpigf.so.4.0
2b4824089000-2b48240a0000 r-xp 00000000 08:01 652925 /lib64/libpthread-2.11.3.so
2b48240a0000-2b48242a0000 ---p 00017000 08:01 652925 /lib64/libpthread-2.11.3.so
2b48242a0000-2b48242a1000 r--p 00017000 08:01 652925 /lib64/libpthread-2.11.3.so
2b48242a1000-2b48242a2000 rw-p 00018000 08:01 652925 /lib64/libpthread-2.11.3.so
2b48242a2000-2b48242a6000 rw-p 00000000 00:00 0
2b48242a6000-2b48242d3000 r-xp 00000000 08:01 1576766 /opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libifport.so.5
2b48242d3000-2b48243d2000 ---p 0002d000 08:01 1576766 /opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libifport.so.5
2b48243d2000-2b48243d5000 rw-p 0002c000 08:01 1576766 /opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libifport.so.5
2b48243d5000-2b48243dc000 rw-p 00000000 00:00 0
2b48243dc000-2b4824508000 r-xp 00000000 08:01 1576758 /opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libifcore.so.5
2b4824508000-2b4824608000 ---p 0012c000 08:01 1576758 /opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libifcore.so.5
2b4824608000-2b4824619000 rw-p 0012c000 08:01 1576758 /opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libifcore.so.5
2b4824619000-2b4824620000 rw-p 00000000 00:00 0
2b4824620000-2b48248a8000 r-xp 00000000 08:01 1566862 /opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libimf.so
2b48248a8000-2b48249a8000 ---p 00288000 08:01 1566862 /opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/libimf.so
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
abinit 000000000140301E Unknown Unknown Unknown
abinit 0000000000D01F30 Unknown Unknown Unknown
abinit 00000000004E7C5E Unknown Unknown Unknown
abinit 000000000041ECEE Unknown Unknown Unknown
abinit 00000000004101F8 Unknown Unknown Unknown
abinit 000000000040853A Unknown Unknown Unknown
abinit 00000000004070DC Unknown Unknown Unknown
libc.so.6 00002B8F99733BFD Unknown Unknown Unknown
abinit 0000000000406FD9 Unknown Unknown Unknown
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)
paulfons@asccmp177:~/AbinitJobs/AlAs>

Re: MPI Berry Phase Tutorial Crash

Post by jzwanzig » Mon Sep 10, 2012 7:05 pm

Hi again. As noted in my response to another of your posts, nspinor 2 doesn't really work for Berry's phase in 6.12.3; it does work in 7.0.1. Furthermore, I am not sure that paral_kgb parallelization (that is, parallelization over k points, bands, FFT planes, and spinors) has been fully tested with the Berry's phase code. Simpler parallelization over k points only has been tested; it is invoked automatically when you run on more than one processor, and you do not use paral_kgb to get it.
Josef W. Zwanziger
Professor, Department of Chemistry
Canada Research Chair in NMR Studies of Materials
Dalhousie University
Halifax, NS B3H 4J3 Canada
jzwanzig@gmail.com
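
For illustration only, here is a minimal sketch of how the parallelization block from the original post might look when following the advice above: the three paral_kgb-related lines are dropped so that only the automatic k-point parallelization is used. The launch line in the comment is an assumption (launcher name, process count and the name of the .files file are hypothetical and depend on the local MPI installation); it is not taken from the thread.

```
#Parallelization
#***********************
# paral_kgb, npkpt and npspinor are left out here. With no paral_kgb
# setting, parallelization over k points is selected automatically
# when abinit is started on more than one MPI process.
#
# Hypothetical launch line (assumed, adapt to your own system):
#   mpirun -np 16 abinit < tffield_1.files > log
```

The rest of tffield_1.in would stay exactly as in the serial run that completed without incident.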