Dear all,
When I ran test/t41.file, the following error occurred:
*** glibc detected *** free(): invalid pointer: 0x0000002a99d35010 ***
p0_15043: p4_error: interrupt SIGx: 6
forrtl: error (69): process interrupted (SIGINT)
rm_l_1_29792: (2.921875) net_send: could not write to fd=5, errno = 32
forrtl: error (69): process interrupted (SIGINT)
p0_15043: (5.964844) net_send: could not write to fd=4, errno = 32
What is the problem?
SDwang
Problem with t41.file test
Moderator: bguster
Re: Problem with t41.file test
1) Read the netiquette in viewtopic.php?f=20&t=251
2) This is probably linked to your build of abinit; in particular, p4_error probably means a parallelization error.
You really can't expect us to explain your crash on so little information...
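For what it is worth, the numbers in the log can at least be decoded. A minimal sketch, assuming a POSIX shell on Linux: `kill -l` translates a signal number into its name. The helper name `signame` is made up for this illustration. Signal 6 is SIGABRT, which is what glibc raises after printing the "free(): invalid pointer" message, and errno 32 is EPIPE (broken pipe) on Linux, which suggests the remaining processes simply lost their connection once the first one died.

```shell
#!/bin/sh
# signame is a hypothetical helper: it prints the symbolic name of a signal number.
signame() {
    # POSIX "kill -l <number>" prints the signal name without the SIG prefix
    printf 'SIG%s\n' "$(kill -l "$1")"
}

# "p4_error: interrupt SIGx: 6" from the log above
signame 6    # prints SIGABRT
```

This only names the signal; it does not say where in the code the invalid free() happened.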
Matthieu
Matthieu Verstraete
University of Liege, Belgium
killed parallel run
I have tested a parallel calculation with ./tests/tparal_1.in, but it stops at:
================================================================================
getcut: wavevector= 0.0000 0.0000 0.0000 ngfft= 36 36 36
ecut(hartree)= 30.000 => boxcut(ratio)= 2.06487
scfcv : before setvtr, energies%e_hartree= 0.000000000000000E+000
ewald : nr and ng are 3 and 11
mklocl_recipspace : will add potential with strength vprtrb(:)=
0.000000000000000E+000 0.000000000000000E+000
setvtr : istep,n1xccc,moved_rhor= 1 0 0
scfcv : after setvtr, energies%e_hartree= 0.000000000000000E+000
ITER STEP NUMBER 1
vtorho : nnsclo_now= 2, note that nnsclo,dbl_nnsclo,istep= 0 0 1
p1_32116: p4_error: interrupt SIGSEGV: 11
p0_32111: p4_error: interrupt SIGSEGV: 11
forrtl: error (69): process interrupted (SIGINT)
rm_l_1_32174: (2.261719) net_send: could not write to fd=5, errno = 32
p1_32116: (2.261719) net_send: could not write to fd=5, errno = 32
p0_32111: (4.523438) net_send: could not write to fd=4, errno = 32
I do not know why. Below is part of my log file.
=== Build Information ===
Version : 6.2.2
Build target : x86_64_linux_intel9.0
Build date : 20101012
=== Compiler Suite ===
C compiler : gnu3.4
CFLAGS : -g -O3 -fschedule-insns2 -march=nocona -mmmx -msse -msse2 -msse3 -mfpmath=sse
C++ compiler : gnu3.4
CXXFLAGS : -g -O3 -fschedule-insns2 -march=nocona -mmmx -msse -msse2 -msse3 -mfpmath=sse
Fortran compiler : intel9.0
FCFLAGS : -g -extend-source -vec-report0
FC_LDFLAGS : -static-libgcc -static-intel
=== Optimizations ===
Debug level : yes
Optimization level : standard
Architecture : intel_xeon
=== MPI ===
Parallel build : yes
Parallel I/O : yes
=== Linear algebra ===
Library flavor : @linalg_flavor@
Use ScaLAPACK : no
=== Plug-ins ===
BigDFT : no
ETSF I/O : no
LibXC : no
FoX : no
NetCDF : no
Wannier90 : no
=== Experimental features ===
Bindings : no
Exports : no
GW double-precision : no
Macroave build : yes
Re: Problem with t41.file test
Your code is segfaulting, but there's no way to tell why at this distance. These input files have run on dozens of reference architectures every night for years, so the problem is with your build or your hardware, or you have modified the input file. Your compilers are quite old, but this should not be the problem.
- Check that your parallel mpif90/mpicc wrappers were built with the same versions of the compilers.
- Compile without optimizations or (first) run under a debugger:
* read the gdb manual or a howto
* mpirun -np 4 abinit < etc.etc.etc. > &
* top gives you the pid for the instances of abinit, then you can run
* gdb $ABINITPATH/abinit <pid1>
* inside gdb, type cont to continue execution, and see where it crashes.
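The attach steps above can be sketched as a small script. This is only an illustration under assumptions: the function names `gdb_attach_cmd` and `attach_all` are made up here, `pgrep -x` stands in for reading the pids off top, and the script prints the gdb command lines instead of running them so you can check them first. The gdb options used (`-batch`, `-ex`) make gdb continue the process and print a backtrace when it stops on the crash.

```shell
#!/bin/sh
# Hypothetical helper: print the gdb command that attaches to one running process.
# $1 = path to the abinit executable, $2 = pid of the running instance
gdb_attach_cmd() {
    # -batch: exit when the commands are done; -ex cont: resume the process;
    # -ex bt: print a backtrace once the process stops (e.g. on SIGSEGV)
    printf 'gdb -batch -ex cont -ex bt %s %s\n' "$1" "$2"
}

# Hypothetical helper: print one attach command per running abinit instance.
# ABINITPATH is the same placeholder as in the steps above.
attach_all() {
    for pid in $(pgrep -x abinit); do
        gdb_attach_cmd "$ABINITPATH/abinit" "$pid"
    done
}
```

After starting the mpirun job in the background, call `attach_all` and run each printed line in its own terminal. Attaching normally requires that the processes belong to your user, and the backtrace is only readable if abinit was compiled with -g.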
Also, does it run sequentially?
matthieu
Matthieu Verstraete
University of Liege, Belgium
Re: Problem with t41.file test
Hi,
I am running into a similar error and cannot figure out why my jobs are crashing. Any help would be greatly appreciated.
Requested basis set is non-standard
Compound shells will be simplified
There are 30 shells and 82 basis functions
A cutoff of 1.0D-12 yielded 442 shell pairs
There are 3388 function pairs ( 4202 Cartesian)
Smallest overlap matrix eigenvalue = 4.51E-03
p0_947: p4_error: interrupt SIGSEGV: 11
Below is my Q-Chem input:
$molecule
0 5
S
Fe 1 2.030996
$end
$rem
BASIS gen
ECP gen
EXCHANGE PBE
CORRELATION PBE
MAX_SCF_CYCLES 200
SCF_ALGORITHM DIIS_GDM
INCDFT FALSE
VARTHRESH FALSE
SYMMETRY FALSE
JOBTYPE freq
MEM_TOTAL = 4000
MEM_STATIC = 256
$end