qh203, Copper Worm (Somewhat Well-Known)
[Help] Parallel computing problem under root versus a regular user
As root, I ran the cpi example in parallel with Open MPI on 6 nodes with 8 CPUs each. The output is normal:

[root@node1 examples]# mpirun -np 40 -machinefile test ./cpi
Process 3 on node2
Process 38 on node6
Process 18 on node4
Process 32 on node6
Process 20 on node4
Process 2 on node2
Process 35 on node6
Process 34 on node6
Process 22 on node4
Process 7 on node2
Process 23 on node4
Process 5 on node2
Process 4 on node2
Process 37 on node6
Process 33 on node6
Process 30 on node5
Process 8 on node3
Process 26 on node5
Process 10 on node3
Process 15 on node3
Process 27 on node5
Process 31 on node5
Process 28 on node5
Process 24 on node5
Process 19 on node4
Process 21 on node4
Process 17 on node4
Process 6 on node2
Process 16 on node4
Process 25 on node5
Process 9 on node3
Process 11 on node3
Process 13 on node3
Process 14 on node3
Process 0 on node2
Process 1 on node2
Process 36 on node6
Process 39 on node6
Process 12 on node3
Process 29 on node5
pi is approximately 3.1416009869231245, Error is 0.0000083333333314
wall clock time = 0.128546

Running the same cpi example with Open MPI as a regular user, the output becomes:

[aojjj@node1 examples]$ mpirun -np 40 -machinefile test ./cpi
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to register memory in the driver.
Please check /var/log/messages or dmesg for driver specific failure reason.
The failure occured here:
Local host: mthca0
Device: openib_reg_mr
Function: Cannot allocate memory()
Errno says:
You may need to consult with your system administrator to get this problem fixed.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to initialize while trying to allocate some locked memory. This typically can indicate that the memlock limits are set too low. For most HPC installations, the memlock limits should be set to "unlimited". The failure occured here:
Local host: node4
OMPI source: btl_openib_component.c:1161
Function: ompi_free_list_init_ex_new()
Device: mthca0
Memlock limit: 32768
You may need to consult with your system administrator to get this problem fixed. This FAQ entry on the Open MPI web site may also be helpful:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: node4
Local device: mthca0
--------------------------------------------------------------------------
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
[the same libibverbs warning repeats several dozen more times]
Process 26 on node5
Process 8 on node3
Process 28 on node5
Process 1 on node2
Process 29 on node5
Process 4 on node2
Process 22 on node4
Process 2 on node2
Process 15 on node3
Process 25 on node5
Process 31 on node5
Process 38 on node6
Process 14 on node3
Process 30 on node5
Process 32 on node6
Process 39 on node6
Process 37 on node6
Process 33 on node6
Process 36 on node6
Process 35 on node6
Process 16 on node4
Process 18 on node4
Process 10 on node3
Process 21 on node4
Process 19 on node4
Process 20 on node4
Process 11 on node3
Process 17 on node4
Process 9 on node3
Process 0 on node2
Process 7 on node2
Process 6 on node2
Process 5 on node2
Process 23 on node4
Process 24 on node5
Process 3 on node2
Process 27 on node5
Process 34 on node6
Process 12 on node3
Process 13 on node3
pi is approximately 3.1416009869231245, Error is 0.0000083333333314
wall clock time = 3.002147
[node1:02112] 39 more processes have sent help message help-mpi-btl-openib.txt / mem-reg-fail
[node1:02112] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[node1:02112] 36 more processes have sent help message help-mpi-btl-openib.txt / init-fail-no-mem
[node1:02112] 39 more processes have sent help message help-mpi-btl-openib.txt / error in device init

The computation still completes, but with many extra warning and error messages, and the wall clock time rises from about 0.13 s to about 3.0 s. I modified /etc/security/limits.conf and /etc/init.d/sshd on every node, but it still does not work. Where exactly is the problem?
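One way to see which locked-memory limit a launched process actually inherits is to print it from inside that process. The following is a minimal sketch in Python, not part of the original post; it assumes Python is available on the nodes, and the helper names `describe_memlock` and `memlock_is_unlimited` are made up for illustration.

```python
import resource


def describe_memlock(limit):
    """Render an RLIMIT_MEMLOCK value as 'unlimited' or a byte count."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit} bytes"


def memlock_is_unlimited(soft, hard):
    """True only when both the soft and the hard limit are unlimited."""
    return soft == resource.RLIM_INFINITY and hard == resource.RLIM_INFINITY


# Query the limits of the current process (the same RLIMIT_MEMLOCK
# that the libibverbs warning above refers to).
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print("soft memlock:", describe_memlock(soft))
print("hard memlock:", describe_memlock(hard))
print("unlimited:", memlock_is_unlimited(soft, hard))
```

Saved under a hypothetical name such as check_memlock.py, it could probably be started through the same `mpirun -machinefile test` launcher instead of `./cpi`, so that the values reported are the ones seen by remotely started processes rather than by an interactive login shell.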
Copper Worm (Somewhat Well-Known)
I have solved this problem myself. The regular user's memlock limit was too low. As root, add the following two lines to /etc/security/limits.conf on every node, replacing <regular-username> with the name of the regular user:

<regular-username> soft memlock unlimited
<regular-username> hard memlock unlimited

Then reboot every server node. <---- This step is important. Otherwise, after switching to the regular user, you get:

memlock cannot modify limit: Operation not permitted
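Since the two lines have to be present on every node, a small check can help catch a node that was missed. This is only a sketch under stated assumptions: the function name `memlock_entries` is hypothetical, it matches exact user names only (wildcard and group domains in limits.conf are deliberately ignored), and it says nothing about whether the limit has taken effect for running sessions.

```python
def memlock_entries(conf_text, user):
    """Collect memlock settings for `user` from limits.conf-style text.

    Each entry has four fields: <domain> <type> <item> <value>.
    A type of '-' sets both the soft and the hard limit.
    Returns a dict such as {'soft': 'unlimited', 'hard': 'unlimited'}.
    """
    found = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        fields = line.split()
        if len(fields) != 4:
            continue  # blank or malformed line
        domain, kind, item, value = fields
        if domain != user or item != "memlock":
            continue
        kinds = ("soft", "hard") if kind == "-" else (kind,)
        for k in kinds:
            found[k] = value  # a later line overrides an earlier one
    return found


sample = """\
# example content only, not copied from a real node
aojjj soft memlock unlimited
aojjj hard memlock unlimited
"""
print(memlock_entries(sample, "aojjj"))
```

On a node, the text would come from reading /etc/security/limits.conf; a result missing either 'soft' or 'hard' suggests that node still needs the edit.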