04nylxb, 木蟲 (Regular Writer)

[Help request]
VASP fails when running across nodes: mpiexec_node-1 (handle_stdin_input 1089)
I recently compiled VASP 5.2 with CNEB on our cluster. The parallel build succeeded, and when I run it on a single node (eight cores per node) with

$ mpirun -np 8 vasp

top shows eight vasp processes, as expected. Running across nodes, however, fails with the following output:

    running on 15 nodes
    distr: one band on 1 nodes, 15 groups
    vasp.5.2.12 11Nov11 complex
    POSCAR found : 1 types and 2 ions
     -----------------------------------------------------------------------------
    |                                                                             |
    |           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
    |           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
    |           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
    |           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
    |           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
    |           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
    |                                                                             |
    |      For optimal performance we recommend that you set                      |
    |        NPAR = approx SQRT( number of cores)                                 |
    |      This will greatly improve the performance of VASP for DFT.             |
    |      The default NPAR=number of cores might be grossly inefficient          |
    |      on modern multi-core architectures or massively parallel machines.     |
    |      Unfortunately you need to use the default for hybrid, GW and RPA       |
    |      calculations.                                                          |
    |                                                                             |
     -----------------------------------------------------------------------------
    LDA part: xc-table for Pade appr. of Perdew
    found WAVECAR, reading the header
    number of bands has changed, file: 12 present: 15
    trying to continue reading WAVECAR, but it might fail
    POSCAR, INCAR and KPOINTS ok, starting setup
    WARNING: small aliasing (wrap around) errors must be expected
    FFT: planning ...( 1 )
    reading WAVECAR
    random initialization beyond band 13
    the WAVECAR file was read sucessfully
    initial charge from wavefunction
    entering main loop
    N E dE d eps ncg rms rms(c)
    mpiexec_node-1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
    mpiexec_node-1 (handle_stdin_input 1090): e.g.: mpiexec -n 4 a.out < /dev/null &
    rank 14 in job 14 node-1_49061 caused collective abort of all ranks
    exit status of rank 14: killed by signal 11
    rank 13 in job 14 node-1_49061 caused collective abort of all ranks
    exit status of rank 13: killed by signal 9
    rank 9 in job 14 node-1_49061 caused collective abort of all ranks
    exit status of rank 9: killed by signal 11
    rank 8 in job 14 node-1_49061 caused collective abort of all ranks
    exit status of rank 8: killed by signal 11
    rank 4 in job 14 node-1_49061 caused collective abort of all ranks
    exit status of rank 4: killed by signal 11
    rank 3 in job 14 node-1_49061 caused collective abort of all ranks
    exit status of rank 3: killed by signal 9
    rank 2 in job 14 node-1_49061 caused collective abort of all ranks
    exit status of rank 2: killed by signal 9
    rank 1 in job 14 node-1_49061 caused collective abort of all ranks
    exit status of rank 1: killed by signal 11
    rank 0 in job 14 node-1_49061 caused collective abort of all ranks

Here node-1 is my control node. With 12 or fewer processes everything runs normally:

$ mpirun -machinefile ~/machinefile -np 12 vasp > 5out

As for MPICH2, I tested it with cpi: every node is OK, and it can run on more than a hundred cores.

Could someone explain why VASP produces this error when running across nodes, and how to fix it? Many thanks.

One more question: during compilation, make makeparam generates a makeparam program. What is it used for?
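The mpiexec message in the log suggests redirecting stdin from /dev/null when the program runs in the background. Below is a minimal Python sketch of that one suggestion, assuming the same mpirun options as in the post. The helper names build_command and launch are hypothetical. This only addresses the stdin complaint; it may well not explain the signal 11 and signal 9 aborts.

```python
import os
import subprocess

def build_command(np, machinefile, program="vasp"):
    """Return the mpirun argument list used in the post (hypothetical helper)."""
    return ["mpirun", "-machinefile", os.path.expanduser(machinefile),
            "-np", str(np), program]

def launch(np, machinefile, out_path, program="vasp"):
    """Start the job with stdin tied to /dev/null and stdout written to out_path."""
    with open(out_path, "w") as out:
        # stdin=DEVNULL plays the role of "< /dev/null" from the mpiexec hint.
        return subprocess.Popen(build_command(np, machinefile, program),
                                stdin=subprocess.DEVNULL, stdout=out)

# Example use (commented out because it needs mpirun and vasp on the PATH):
# proc = launch(15, "~/machinefile", "5out")
# print("exit status:", proc.wait())
```

This is roughly what the shell form mpirun -machinefile ~/machinefile -np 15 vasp < /dev/null > 5out would do.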

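木蟲 (Regular Writer)

Thank you very much.

Yes, I had set NPAR to the number of parallel cores. The node count seems impossible to predict, though: sometimes the job scheduler assigns 4 nodes, sometimes 10. Does that mean an exact node count is not required, and I can follow the hint and use roughly the square root of the number of cores?

On the MPI side, I am using MPICH2. I tested it with cpi from the examples directory that ships with MPI, and the parallel runs all complete without problems; whichever nodes I specify, the output contains a run report from each of them. Does that mean the MPI installation is fine?

While testing yesterday I noticed another problem. Sometimes I submit a job with, say, -np 64, and it runs normally, with vasp processes assigned to every node. An hour or two later I run the same job again, and VASP fails with the error above. Very frustrating.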
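The VASP banner quoted in the first post states the hint in terms of cores, not nodes: NPAR = approx SQRT(number of cores). So a value can be derived at submission time from whatever core count the scheduler grants. Here is a minimal sketch, using the hypothetical helper suggest_npar. Restricting the result to a divisor of the core count is an assumption of this sketch, so that the cores split evenly into groups; it is not something the banner requires.

```python
import math

def suggest_npar(ncores):
    """Pick the divisor of ncores closest to sqrt(ncores).

    Limiting the choice to divisors is an assumption of this sketch,
    so that the cores split evenly into NPAR groups; the VASP hint
    itself only says 'approx SQRT( number of cores)'.
    """
    if ncores < 1:
        raise ValueError("ncores must be positive")
    target = math.sqrt(ncores)
    divisors = [d for d in range(1, ncores + 1) if ncores % d == 0]
    # On a tie in distance, prefer the smaller divisor.
    return min(divisors, key=lambda d: (abs(d - target), d))

for n in (8, 12, 15, 64):
    print(n, suggest_npar(n))
```

For the core counts mentioned in this thread, the sketch gives 2 for 8 cores, 3 for 12, 3 for 15, and 8 for 64. The chosen value would then be written to the NPAR line of INCAR before the job starts.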
