| Views: 3316 | Replies: 5 |
04nylxb, Muchong (Regular Writer)
[Help] VASP fails when run across nodes: mpiexec_node-1 (handle_stdin_input 1089)
I recently compiled vasp5.2 with CNEB on a cluster. The parallel build succeeded, and when I run on a single node (eight cores per node) with

    $ mpirun -np 8 vasp

top shows that eight vasp processes are indeed running. When I run across nodes, however, the job fails with the following output:

    running on 15 nodes
    distr: one band on 1 nodes, 15 groups
    vasp.5.2.12 11Nov11 complex
    POSCAR found : 1 types and 2 ions
    -----------------------------------------------------------------------------
    WARNING !!!
    For optimal performance we recommend that you set
    NPAR = approx SQRT( number of cores)
    This will greatly improve the performance of VASP for DFT.
    The default NPAR=number of cores might be grossly inefficient
    on modern multi-core architectures or massively parallel machines.
    Unfortunately you need to use the default for hybrid, GW and RPA
    calculations.
    -----------------------------------------------------------------------------
    LDA part: xc-table for Pade appr. of Perdew
    found WAVECAR, reading the header
    number of bands has changed, file: 12 present: 15
    trying to continue reading WAVECAR, but it might fail
    POSCAR, INCAR and KPOINTS ok, starting setup
    WARNING: small aliasing (wrap around) errors must be expected
    FFT: planning ...( 1 )
    reading WAVECAR
    random initialization beyond band 13
    the WAVECAR file was read sucessfully
    initial charge from wavefunction
    entering main loop
    N E dE d eps ncg rms rms(c)
    mpiexec_node-1 (handle_stdin_input 1089): stdin problem; if pgm is run in background, redirect from /dev/null
    mpiexec_node-1 (handle_stdin_input 1090): e.g.: mpiexec -n 4 a.out < /dev/null &
    rank 14 in job 14 node-1_49061 caused collective abort of all ranks
      exit status of rank 14: killed by signal 11
    rank 13 in job 14 node-1_49061 caused collective abort of all ranks
      exit status of rank 13: killed by signal 9
    rank 9 in job 14 node-1_49061 caused collective abort of all ranks
      exit status of rank 9: killed by signal 11
    rank 8 in job 14 node-1_49061 caused collective abort of all ranks
      exit status of rank 8: killed by signal 11
    rank 4 in job 14 node-1_49061 caused collective abort of all ranks
      exit status of rank 4: killed by signal 11
    rank 3 in job 14 node-1_49061 caused collective abort of all ranks
      exit status of rank 3: killed by signal 9
    rank 2 in job 14 node-1_49061 caused collective abort of all ranks
      exit status of rank 2: killed by signal 9
    rank 1 in job 14 node-1_49061 caused collective abort of all ranks
      exit status of rank 1: killed by signal 11
    rank 0 in job 14 node-1_49061 caused collective abort of all ranks

Here node-1 is my control node. With 12 processes or fewer, everything runs normally:

    $ mpirun -machinefile ~/machinefile -np 12 vasp > 5out

As for mpich2, I tested it with cpi: every node is OK, and it can run on more than a hundred cores.

Could someone explain why vasp produces this error when running across nodes, and how to fix it? Many thanks.

One more question: when compiling, I ran make makeparam. What is the generated makeparam used for?
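When reading an abort report like the one above, it can help to group the ranks by the signal that killed them. The following is a minimal sketch, not a definitive diagnostic: the function name `ranks_by_signal` and the `LOG` sample are illustrative, and it assumes a Linux host, where signal 11 is SIGSEGV and signal 9 is SIGKILL. A plausible reading (an assumption, not something the log proves) is that the SIGSEGV ranks crashed on their own, while the SIGKILL ranks were terminated by the launcher during the collective abort.

```python
import re
import signal
from collections import defaultdict

# Exit-status lines copied from the error output in the post.
LOG = """\
exit status of rank 14: killed by signal 11
exit status of rank 13: killed by signal 9
exit status of rank 9: killed by signal 11
exit status of rank 8: killed by signal 11
exit status of rank 4: killed by signal 11
exit status of rank 3: killed by signal 9
exit status of rank 2: killed by signal 9
exit status of rank 1: killed by signal 11
"""

EXIT_RE = re.compile(r"exit status of rank (\d+): killed by signal (\d+)")


def ranks_by_signal(text):
    """Map each signal name to the sorted list of ranks it killed."""
    grouped = defaultdict(list)
    for rank, signum in EXIT_RE.findall(text):
        try:
            # Translate the number into a name using the host's signal table.
            name = signal.Signals(int(signum)).name
        except ValueError:
            # Unknown signal number on this platform: keep the raw number.
            name = f"signal {signum}"
        grouped[name].append(int(rank))
    return {name: sorted(ranks) for name, ranks in grouped.items()}


if __name__ == "__main__":
    for name, ranks in ranks_by_signal(LOG).items():
        print(f"{name}: ranks {ranks}")
```

If the segfaulting ranks all map to the same hosts in the machinefile, that would point to a per-node problem (for example stack limits or a library mismatch) rather than to VASP itself; that mapping has to be checked by hand against the machinefile.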

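Muchong (Regular Writer)

Thank you very much.

Yes, I have been setting NPAR to the number of cores used in the parallel run. The node count feels impossible to predict: sometimes the job scheduler gives me 4 nodes, sometimes 10. Does NPAR not need to match the node count exactly? Is it enough to follow the warning and use roughly the square root of the number of cores?

On the MPI side, I am using mpich2. I tested it with cpi from the examples shipped with MPI, and the parallel runs all completed without problems. When I specify several nodes, the output contains a report from each of them. Does that mean the MPI installation is fine?

While testing yesterday I noticed another problem. Sometimes I submit a job with, say, -np 64, and it runs normally, with vasp processes assigned to every node. Then an hour or two later I run the same job again and vasp fails with the error above. Very frustrating.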
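Since the scheduler hands out a different number of cores each time, the NPAR value could be computed at submission time instead of being fixed in INCAR. The sketch below follows the VASP warning quoted in the first post (NPAR = approx SQRT(number of cores)). The function name `suggest_npar` is illustrative, and restricting the result to a divisor of the core count is an assumption made here so that the cores split evenly into groups; it is not something stated in the thread.

```python
import math


def suggest_npar(ncores):
    """Return the divisor of ncores closest to sqrt(ncores).

    Assumption: NPAR should divide the core count evenly.
    Ties are broken in favour of the smaller divisor.
    """
    if ncores < 1:
        raise ValueError("ncores must be a positive integer")
    target = math.sqrt(ncores)
    divisors = [d for d in range(1, ncores + 1) if ncores % d == 0]
    return min(divisors, key=lambda d: (abs(d - target), d))


if __name__ == "__main__":
    # Core counts mentioned in the thread: 8 per node, runs with 12, 15, 64.
    for n in (8, 12, 15, 64):
        print(f"{n} cores -> NPAR = {suggest_npar(n)}")
```

A job script could call this with the core count reported by the scheduler and write the result into INCAR before launching mpirun. As the warning itself notes, hybrid, GW and RPA calculations need the default NPAR, so this would not apply to them.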
