cnlics, forum rank 木蟲 ("Wood Bug", somewhat well known)
[Discussion]
[Share] Protein structure prediction workflow (23 participants so far)
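I will translate this bit by bit and post it as I go. The material posted here is something I collected a while ago; it most likely comes from EMBL. I have skimmed through it and it is not out of date yet. The Word document can be downloaded here: http://ifile.it/dwzy278

The general workflow for protein structure prediction is shown in the figure below:

[Figure: protein structure prediction flowchart (image not available)]

Contents:
• Relevant experimental data
• Sequence data and preliminary analysis
• Searching sequence databases
• Identifying domains
• Multiple sequence alignment
• Comparative or homology modelling
• Secondary structure prediction
• Fold recognition
• Fold analysis and secondary structure alignment
• Alignment of sequence to structure

[ Last edited by cnlics on 2010-9-16 at 08:24 ]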
Comparative or homology modelling

If a protein sequence shows significant similarity to another protein of known three-dimensional structure, a fairly accurate 3D model of the protein can be obtained by homology modelling. It is also possible to build models if you have found a suitable fold via fold recognition and are happy with the alignment of sequence to structure (note that the accuracy of models constructed in this manner has not been assessed properly, so treat them with caution). Models can now be generated automatically using the very useful SWISSMODEL server. Some other sites useful for homology modelling include:
• WHAT IF (G. Vriend, EMBL, Heidelberg)
• MODELLER (A. Sali, Rockefeller University)
• MODELLER Mirror FTP site

Sequence alignments, particularly those involving proteins with low percent sequence identity, can be inaccurate. If this is the case, a model built using the alignment will obviously be wrong in some places. I would suggest that you look over the alignment carefully before building a model.

Note that when using SWISSMODEL it is possible to send in a protein sequence only. For the above reasons, I would only recommend doing this if the degree of sequence homology is high (50% or greater). It is best, particularly if one has edited an alignment, to send the alignment directly to the server.

Once you have a three-dimensional model, it is useful to look at protein 3D structures. There are numerous free programs for doing this, including:
• GRASP (Anthony Nicholls, Columbia, USA)
• MolMol (Reto Koradi, ETH, Zurich, Switzerland)
• Prepi (Suhail Islam, ICRF, U.K.)
• RasMol (Roger Sayle, Glaxo, U.K.)

Most places with groups studying structural biology also have commercial packages, such as Quanta, SYBYL or Insight, which contain more features than the visualisation packages described above. Crystallographers also tend to use O and FRODO, though these take a lot of experience to use with ease.
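As a rough illustration of the "check the alignment first" advice above, the following Python sketch computes percent identity from a pairwise alignment that you already have as two gapped strings. It is only a sketch under stated assumptions: the function names are made up for this example, identity is counted over columns where neither sequence has a gap (one of several conventions in use), and the 50% figure is simply the threshold mentioned in the post, read here as percent identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs are rows of the same pairwise alignment, with '-' for gaps.
    (Hypothetical helper; other definitions divide by alignment length
    or by the length of the shorter sequence.)
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where those residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def sequence_only_is_reasonable(identity: float, threshold: float = 50.0) -> bool:
    """True if identity meets the threshold suggested in the post (assumed 50%)."""
    return identity >= threshold


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    target = "ACDE-FG"
    template = "ACDQHFG"
    identity = percent_identity(target, template)
    print(f"identity = {identity:.1f}%")
    print("sequence-only submission reasonable:",
          sequence_only_is_reasonable(identity))
```

A high identity value does not replace inspecting the alignment by eye; it is only a quick first filter before deciding how much manual editing the alignment needs.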
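Protein sequence data

Some preliminary analysis of the protein sequence is worthwhile. For example, if the protein comes directly from a gene prediction, it may contain multiple domains. More seriously, it may contain regions that are unlikely to be globular or soluble. This flowchart assumes that your protein is soluble, probably forms a single domain, and contains no non-globular regions.

The following points need to be considered:
• Is it a transmembrane protein, or does it contain transmembrane segments? There are many methods for predicting these segments, including:
  o TMAP (EMBL)
  o PredictProtein (EMBL/Columbia)
  o TMHMM (CBS, Denmark)
  o TMpred (Baylor College)
  o DAS (Stockholm)
• Does it contain coiled coils? You can predict coiled coils on the COILS server or download the COILS program (it has recently been rewritten; note that the GCG package includes a version of COILS).
• Does the protein contain low-complexity regions? Proteins often contain several poly-glutamate or poly-serine stretches, for which predictions are unreliable. You can check for these with SEG (the GCG package includes a version of SEG).

If any of the above applies, you should break the sequence into pieces, or ignore particular segments of the sequence, and so on. This issue is related to the problem of locating domains.

[ Last edited by cnlics on 2010-9-16 at 08:25 ]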
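To make the low-complexity check more concrete, here is a minimal Python sketch that flags stretches of a sequence with low Shannon entropy in a sliding window. This is not the SEG algorithm and is not meant to reproduce its output; it is a simplified stand-in, and the function names, the window length of 12 and the entropy cutoff of 2.2 bits are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_regions(seq: str, window: int = 12, max_entropy: float = 2.2):
    """Return merged (start, end) half-open intervals of low-entropy windows.

    Assumed, illustrative parameters: window=12 residues, cutoff=2.2 bits.
    """
    seq = seq.upper()
    regions = []
    for start in range(len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) <= max_entropy:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Invented test sequence with a 15-residue poly-Q run in the middle.
    seq = "MKTAYIAKQR" + "Q" * 15 + "DEGHILKMNP"
    for start, end in low_complexity_regions(seq):
        print(start, end, seq[start:end])
```

Regions reported this way are candidates for masking or for splitting the sequence before running further predictions, in the spirit of the advice above; a dedicated tool such as SEG should be preferred for real analyses.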
Searching sequence databases

The first step in analysing any new sequence is obviously to search the sequence databases for homologous sequences. Such searches can be done anywhere and on any computer. In addition, many web servers perform this kind of search: you type or paste a sequence into the server and receive the results interactively.

There are also many methods for sequence searching; at present the best known is the BLAST program. A version that runs locally is easy to obtain (from NCBI or Washington University), and there are many web pages that let you compare a protein or DNA sequence against a variety of gene or protein sequence databases. To name just a few:
• National Center for Biotechnology Information (USA) Searches
• European Bioinformatics Institute (UK) Searches
• BLAST search through SBASE (domain database; ICGEB, Trieste)
• Many more sites

An important recent advance in sequence comparison is the development of gapped BLAST and PSI-BLAST (position specific iterated BLAST), both of which make BLAST more sensitive. The latter takes the results of a search, builds a profile from them, and then uses the profile to search the database again for further homologues (this process can be repeated until no new sequences are found), which allows it to detect very distant evolutionary relationships. It is important to compare your protein sequence against the databases with PSI-BLAST, looking for any protein of known structure, before using the methods in the sections below.

Other methods for comparing a single sequence against a database include:
• FASTA package (William Pearson, University of Virginia, USA)
• SCANPS (Geoff Barton, European Bioinformatics Institute, UK)
• BLITZ (Compugen's fast Smith Waterman search)
• Other methods

It is also possible to use multiple sequence information to perform more sensitive searches. This involves building a profile from some kind of multiple sequence alignment. A profile gives a score for each type of amino acid at each position in the sequence, and generally makes searches more sensitive. Tools for doing this include:
• PSI-BLAST (NCBI, Washington)
• ProfileScan Server (ISREC, Geneva)
• HMMER, hidden Markov models (Sean Eddy, Washington University)
• Wise package (Ewan Birney, Sanger Centre; for comparing proteins against DNA)
• Other methods

A different approach for incorporating multiple sequence information into a database search is to use a MOTIF. Instead of giving every amino acid some kind of score at every position in an alignment, a motif ignores all but the most invariant positions in an alignment, and just describes the key residues that are conserved and define the family. Sometimes this is called a "signature". For example, "H-[FW]-x-[LIVM]-x-G-x(5)-[LV]-H-x(3)-[DE]" describes a family of DNA binding proteins.
It can be translated as "histidine, followed by either a phenylalanine or tryptophan, followed by an amino acid (x), followed by leucine, isoleucine, valine or methionine, followed by any amino acid (x), followed by glycine,... [etc.]". PROSITE (ExPASy Geneva) contains a huge number of such patterns, and several sites allow you to search these data:
• ExPASy
• EBI

It is best to search a few different databases in order to find as many homologues as possible. A very important thing to do, and one which is sometimes overlooked, is to compare any new sequence to a database of sequences for which 3D structure information is available. Whether or not your sequence is homologous to a protein of known 3D structure is not obvious in the output from many searches of large sequence databases. Moreover, if the homology is weak, the similarity may not be apparent at all during the search through a larger database.

One last thing to remember is that one can save a lot of time by making use of pre-prepared protein alignments. Many of these alignments are hand edited by experts on the particular protein families, and thus probably represent the best alignment one can get given the data they contain (though they are not always as up to date as the most recent sequence databases). These databases include:
• SMART (Oxford/EMBL)
• PFAM (Sanger Centre/Wash-U/Karolinska Institutet)
• COGS (NCBI)
• PRINTS (UCL/Manchester)
• BLOCKS (Fred Hutchinson Cancer Research Centre, Seattle)
• SBASE (ICGEB, Trieste)

In general there are many methods for comparing a protein sequence against databases, and these are very useful for identifying domains.

[ Last edited by cnlics on 2010-9-14 at 19:54 ]
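The motif notation described above maps quite directly onto regular expressions, so a small Python sketch may help show how such a pattern can be scanned against a sequence. This is a simplified converter written for this post, not code from PROSITE or ExPASy: it assumes only the common pattern elements (single residues, `[...]` allowed sets, `{...}` forbidden sets, `x` wildcards, `(n)` or `(n,m)` repeats, and `<` / `>` terminal anchors), and the function name and the test sequence are invented for illustration.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a PROSITE-style pattern into a Python regular expression.

    Simplified sketch: handles residues, [..], {..}, x, (n), (n,m), < and >.
    Rarer syntax (for example anchors inside brackets) is not handled.
    """
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = repeat = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        if element.endswith(")"):        # repetition such as x(5) or x(2,4)
            element, _, count = element[:-1].partition("(")
            repeat = "{" + count + "}"
        if element in ("x", "X"):        # any amino acid
            core = "."
        elif element.startswith("{"):    # any amino acid except those listed
            core = "[^" + element[1:-1] + "]"
        else:                            # single residue or [..] set
            core = element
        pieces.append(prefix + core + repeat + suffix)
    return "".join(pieces)


if __name__ == "__main__":
    motif = "H-[FW]-x-[LIVM]-x-G-x(5)-[LV]-H-x(3)-[DE]"
    regex = prosite_to_regex(motif)
    print(regex)

    # Invented sequence containing one occurrence of the motif.
    seq = "AAA" + "HFALAG" + "AAAAA" + "LH" + "AAA" + "D" + "KK"
    match = re.search(regex, seq)
    if match:
        print("motif found at position", match.start(), ":", match.group())
```

For real work it is safer to use the scanning tools offered by the sites listed above, since a match to a short pattern can easily occur by chance.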