Memory Issue When Running Abinit In Parallel

xmilchmann
Posts: 2
Joined: Mon Jun 20, 2011 11:56 pm

Post by xmilchmann » Wed Jun 22, 2011 12:45 am

Hello,

I'm attempting to calculate the total energy of a single aluminum atom using the parallel version of abinit (abinip). So far I've run the calculation only with the serial version of abinit, and found that a cell size of 80x80x80 and an energy cutoff (ecut) of 40.0 are necessary to get a properly converged value.
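For a rough sense of scale, the parameters above can be turned into an order-of-magnitude size estimate. This is only a sketch under stated assumptions: it assumes the cell is cubic with a side of 80 Bohr, that ecut is 40.0 Hartree, a single k-point, and a real-space grid that must hold wavevectors up to twice the cutoff wavevector. The function name `plane_wave_estimate` and the parameter `boxcut_ratio` are made up for illustration, and the numbers are not what abinit actually allocates.

```python
import math

def plane_wave_estimate(cell_side, ecut, boxcut_ratio=2.0):
    """Order-of-magnitude estimate for a cubic cell.

    Assumes atomic units: cell_side in Bohr, ecut in Hartree.
    """
    # Cutoff wavevector, from ecut = g_max**2 / 2
    g_max = math.sqrt(2.0 * ecut)
    volume = cell_side ** 3
    # Plane waves inside the cutoff sphere:
    # (sphere volume in reciprocal space) / (reciprocal volume per G-vector)
    n_pw = (4.0 / 3.0) * math.pi * g_max ** 3 * volume / (2.0 * math.pi) ** 3
    # Grid points per side needed to represent wavevectors
    # up to boxcut_ratio * g_max in each direction
    n_side = math.ceil(boxcut_ratio * g_max * cell_side / math.pi)
    return n_pw, n_side

n_pw, n_side = plane_wave_estimate(80.0, 40.0)
# One double-precision real array on the full grid (8 bytes per point)
grid_bytes = n_side ** 3 * 8

print(f"plane waves (approx.): {n_pw:.3g}")
print(f"grid points per side : {n_side}")
print(f"one real grid array  : {grid_bytes / 1e9:.2f} GB")
```

Under these assumptions a single grid-sized array is already close to a gigabyte, so any per-process copies of such arrays would add up quickly.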

My problem arises when I run the calculation with these parameters using the parallel version of abinit on my cluster. Either I get the error "forrtl: severe (41): insufficient virtual memory" in my log file and the job terminates, or my cluster terminates the job due to excessive memory usage (resource violation). I've tried increasing my memory allotment when I run the job (setting pmem=4000mb in my PBS script) and increasing the number of processors (32 tried last), but to no avail. Bear in mind that ~4 GB of memory for each of 32 processors gives me a total of ~128 GB of memory... and I'm still running out. The email I get when the job is terminated looks like this:

Execution terminated
Exit_status=41
resources_used.cput=00:31:59
resources_used.mem=146545824kb
resources_used.vmem=282983892kb
resources_used.walltime=00:02:42

That's over 280 GB of virtual memory used!!! This confuses me, because the same calculation ran in serial on the head node of my cluster without issue, and the head node has much less memory. Why is it consuming so much memory in parallel? I've done other calculations using the parallel version of abinit, so I know that the software works.
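To make the comparison concrete, here is a small sketch that converts the figures from the email and compares them with the request. It assumes the PBS "kb" unit means 1024 bytes (if it means 1000 bytes, the totals come out a few percent differently) and that usage is spread evenly over the 32 processes, which may not be the case.

```python
KIB = 1024  # assumption: PBS "kb" means 1024 bytes

def kb_to_gib(kb):
    """Convert a PBS kilobyte count to GiB (1 GiB = 1024**3 bytes)."""
    return kb * KIB / 1024 ** 3

n_procs = 32
pmem_mb = 4000  # per-process request from the PBS script

requested_mb = n_procs * pmem_mb           # total requested, in MB
mem_gib = kb_to_gib(146545824)             # resources_used.mem
vmem_gib = kb_to_gib(282983892)            # resources_used.vmem
vmem_per_proc_gib = vmem_gib / n_procs     # average, assuming an even spread

print(f"requested total     : {requested_mb} MB")
print(f"resident memory used: {mem_gib:.1f} GiB")
print(f"virtual memory used : {vmem_gib:.1f} GiB")
print(f"virtual per process : {vmem_per_proc_gib:.1f} GiB")
```

Whichever unit convention applies, the average virtual memory per process is roughly twice the 4000 MB requested per process.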

Eventually, I will also need to calculate the energy of a CLUSTER of aluminum atoms, so I expect to need an even larger cell size and/or cutoff value to get correct results. Can this calculation be run in a parallel environment, or is it beyond the capabilities of abinit and current hardware? What am I missing?

I've attached my input file for examination. Any help that more experienced abinit users could offer would be greatly appreciated.

Thanks,
Mike
Attachments
Alsingle.in (1.15 KiB)