wuchenwf, Honorary Moderator (Professional Writer)

[Discussion]
mpdtrace fails after a successful mpich2 install. Thanks in advance, everyone.
Hello, everyone. My machine is a 4-core Intel box. I built mpich2 with ifort; the install directory and the expected files such as mpd were created, and running mpd works. Running mpdtrace afterwards, however, produces an error. Below are my commands and the error output (my machine is named wuchenwf-desktop, and mpich2 is installed in /opt/mpich2):

---------------------------------------------------------------------
root@wuchenwf-desktop:~# mpd &
[1] 8542
root@wuchenwf-desktop:~# mpdtrace
mpdtrace (send_dict_msg 426):send_dict_msg: sock= errmsg= 32, 'Broken pipe'):
mpdtb:
/opt/mpich2/bin/mpdlib.py, 426, send_dict_msg
/opt/mpich2/bin/mpdtrace, 51, mpdtrace
/opt/mpich2/bin/mpdtrace, 83,
mpdtrace: unexpected msg from mpd=:{'error_msg': 'invalid secretword to root mpd'}:
--------------------------------------------------------------------------

These look like two separate errors to me, and running mpdallexit fails in a similar way. How can I fix this? Sorry for the trouble, and many thanks.
Die-hard Muchong member (Official Writer)
Veteran Muchong member
Bookmark the article below; it is very useful for MPICH2 users.

Attached: debugging mpd errors when running mpich2

1. Install mpich2, and thus mpd.

2. Make sure the mpich2 bin directory is in your path. Below, we will refer to it as MPDDIR.

3. Kill old mpd processes. If you are coming to this guide from elsewhere, e.g. a Quick Start guide for mpich2, because you encountered mpd problems, you should make sure that all mpd processes are terminated on the hosts where you have been testing. mpdallexit may assist in this, but probably not if you were having problems. You may need to use the Unix kill command to terminate the processes.

4. Run a first mpd (alone on a first node). As mentioned above, mpd uses client-server communications to perform its work. So, before running an mpd, let's run a simpler program (mpdcheck) to verify that these communications are likely to be successful. Even on hosts where communications are well supported, there are sometimes problems associated with hostname resolution, etc., so it is worth the effort to proceed a bit slowly. Below, we assume that you have installed mpd and have it in your path.

Select a test node; let's call it n1. Log in to n1. First, we will run mpdcheck as a server and a client. To run it as a server, open a window with a command line and run this:

    n1 $ mpdcheck -s

It will print something like this:

    server listening at INADDR_ANY on: n1 1234

Now, run the client side (in another window if convenient) and see if it can find the server and communicate.
Be sure to use the same hostname and port number printed by the server (above: n1 1234):

    n1 $ mpdcheck -c n1 1234

If all goes well, the server will print something like:

    server has conn on from ('192.168.1.1', 1234)
    server successfully recvd msg from client: hello_from_client_to_server

and the client will print:

    client successfully recvd ack from server: ack_from_server_to_client

If the experiment failed, you have some network or machine configuration problem which will also be a problem later when you try to use mpd. If the experiment succeeded but the hostname printed by the server was localhost, you will probably have problems later if you try to use mpd on n1 in conjunction with other hosts. In either case, skip to Section A.2, "Debugging host/network configuration problems," of the "Troubleshooting MPDs" appendix.

If the experiment succeeded, then you should be ready to try mpd on this one host. To start an mpd, you will use the mpd command. To run parallel programs, you will use the mpiexec program. All mpd commands accept the -h or --help arguments, e.g.:

    n1 $ mpd --help
    n1 $ mpiexec --help

Try a few tests:

    n1 $ mpd &
    n1 $ mpiexec -n 1 /bin/hostname
    n1 $ mpiexec -l -n 4 /bin/hostname
    n1 $ mpiexec -n 2 PATH_TO_MPICH2_EXAMPLES/cpi

where PATH_TO_MPICH2_EXAMPLES is the path to the mpich2-1.0.3/examples directory. To terminate the mpd:

    n1 $ mpdallexit

5. Run a second mpd (alone on a second node). To verify that things are fine on a second host (say n2), log in to n2 and perform the same set of tests that you did on n1. Make sure that you use mpdallexit to terminate the mpd so you will be ready for further tests.

6. Run a ring of two mpds on two hosts. Before running a ring of mpds on n1 and n2, we will again use mpdcheck, but this time between the two machines. We do this because the two nodes may have trouble locating each other or communicating with each other, and it is easier to check this with the smaller program.
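For readers wondering what kind of exchange such a check exercises, here is a minimal Python sketch of a one-shot TCP server and client trading a hello and an ack on a single host. It is not mpdcheck's actual implementation: the function names are made up for illustration, the message strings are simply borrowed from the sample output above, and the real tool also reports hostname-resolution details that this sketch ignores.

```python
import socket
import threading

# Message strings borrowed from the sample mpdcheck output; the protocol
# below is an illustrative assumption, not mpdcheck's real wire format.
HELLO = b"hello_from_client_to_server"
ACK = b"ack_from_server_to_client"


def serve_once(listener, received):
    """Accept one connection, record the client's message, send the ack."""
    conn, _addr = listener.accept()
    with conn:
        # A single recv is enough for a tiny message on a local connection.
        received.append(conn.recv(1024))
        conn.sendall(ACK)


def check_roundtrip(host="127.0.0.1"):
    """Run a one-shot server and client on `host`.

    Returns (message_seen_by_server, reply_seen_by_client).
    """
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((host, 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener, received))
        server.start()
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(HELLO)
            reply = client.recv(1024)
        server.join(timeout=5)
    return received[0], reply
```

Calling check_roundtrip() with the default loopback address only proves that local sockets work. Passing the machine's real hostname instead is closer in spirit to the test in the guide, since it also depends on that name resolving to a usable address.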
First, we will make sure that a server on n1 can service a client from n2. On n1:

    n1 $ mpdcheck -s

which will print a hostname (hopefully n1) and a port number (say 3333 here). On n2:

    n2 $ mpdcheck -c n1 3333

If this experiment fails, skip to Section A.2 "Debugging host/network configuration problems".

Second, we will make sure that a server on n2 can service a client from n1. On n2:

    n2 $ mpdcheck -s

which will print a hostname (hopefully n2) and a port number (say 7777 here). On n1:

    n1 $ mpdcheck -c n2 7777

If this experiment fails, skip to Section A.2 "Debugging host/network configuration problems".

If all went well, we are ready to try a pair of mpds on n1 and n2. First, make sure that all mpds have terminated on both n1 and n2. Use mpdallexit or simply kill them with:

    kill -9 PID_OF_MPD

where you have obtained PID_OF_MPD by some means such as the ps command. On n1:

    n1 $ mpd &
    n1 $ mpdtrace -l

This will print a list of machines in the ring, in this case just n1. The output will be something like:

    n1_6789 (192.168.1.1)

The 6789 is the port that the mpd is listening on for connections from other mpds wishing to enter the ring. We will use that port in a moment to get an mpd from n2 into the ring. The value in parentheses should be the IP address of n1. On n2:

    n2 $ mpd -h n1 -p 6789 &

where 6789 is the listening port on n1 (from mpdtrace above). Now try:

    n2 $ mpdtrace -l

You should see both mpds in the ring. To run some programs in parallel:

    n1 $ mpiexec -n 2 /bin/hostname
    n1 $ mpiexec -n 4 /bin/hostname
    n1 $ mpiexec -l -n 4 /bin/hostname
    n1 $ mpiexec -l -n 4 PATH_TO_MPICH2_EXAMPLES/cpi

where PATH_TO_MPICH2_EXAMPLES is the path to the mpich2-1.0.5/examples directory. To bring down the ring of mpds:

    n1 $ mpdallexit

7. Boot a ring of two mpds via mpdboot. Please be aware that mpdboot uses ssh by default to start remote mpds. It will expect that you can run ssh from n1 to n2 (and from n2 to n1) without entering a password.
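One way to confirm the passwordless-ssh requirement before trying mpdboot is to run ssh in batch mode, where it fails instead of prompting. The Python sketch below wraps that idea. The helper names are hypothetical, the 5-second timeout is an arbitrary choice, and the hostnames are whatever you pass in; BatchMode and ConnectTimeout are standard OpenSSH client options.

```python
import subprocess


def ssh_probe_command(host):
    """Build an ssh command that fails instead of prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def passwordless_ssh_ok(host, run=subprocess.run):
    """Return True if `ssh host true` succeeds with no interactive prompt.

    `run` is injectable so the check can be exercised without a network;
    by default it is subprocess.run.
    """
    result = run(
        ssh_probe_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

On n1 you would call passwordless_ssh_ok("n2"), and on n2 passwordless_ssh_ok("n1"). A False result can mean a missing key, an unreachable host, or an unconfirmed host key, so it tells you that something is wrong but not what.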
First, make sure that you terminate the mpd processes from any prior tests. On n1, create a file named mpd.hosts containing the name of n2:

    n2

Then, on n1 run:

    n1 $ mpdboot -n 2
    n1 $ mpdtrace -l
    n1 $ mpiexec -l -n 2 /bin/hostname

The mpdboot command should read the mpd.hosts file created above and run an mpd on each of the two machines. The mpdtrace and mpiexec commands show that the ring is up and functional. Options that may be useful are:

· --help: use this one for extra details on all options
· -v: verbose
· --chkup: tries to verify that the hosts are up before starting mpds
· --chkuponly: only performs the verify step, then ends

To bring the ring down:

    n1 $ mpdallexit

If mpdboot works on the two machines n1 and n2, it will probably work on your others as well. But there could be configuration problems on a new machine on which you have not yet tested mpd. An easy way to check is to gradually add machines to mpd.hosts and try an mpdboot with a -n argument that uses them all each time. Use mpdallexit after each test.

[ Last edited by alwens on 2008-1-18 at 14:32 ]
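The "gradually add machines" advice at the end of step 7 can be written down as a small plan generator. This Python sketch only produces the sequence of mpd.hosts contents and matching -n values; it does not run anything. The function names are invented for illustration, and the rule that -n equals the number of listed hosts plus one (for the local mpd) is inferred from the guide's own example, where mpd.hosts holds only n2 and the command is mpdboot -n 2.

```python
def incremental_boot_plan(remote_hosts):
    """Yield (hosts_listed_in_mpd_hosts, n) for each step of the gradual test.

    n counts the local mpd plus every remote host listed so far
    (assumption taken from the guide's two-host example).
    """
    for count in range(1, len(remote_hosts) + 1):
        listed = list(remote_hosts[:count])
        yield listed, len(listed) + 1


def format_plan(remote_hosts):
    """Render each step as a one-line reminder of what to put where."""
    lines = []
    for listed, n in incremental_boot_plan(remote_hosts):
        hosts = ", ".join(listed)
        lines.append(
            f"mpd.hosts = {hosts} -> mpdboot -n {n}; mpdtrace -l; mpdallexit"
        )
    return lines
```

For example, format_plan(["n2", "n3", "n4"]) gives three steps, ending with all three remote hosts in mpd.hosts and mpdboot -n 4. The first step that fails points at the machine added in that step.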
