Knowledge (XXG)

Homology modeling

Source

468:
solvation term, is also applied extensively in model evaluation. Other methods include the Errat program (Colovos and Yeates 1993), which considers distributions of nonbonded atoms according to atom type and distance, and the energy strain method (Maiorov and Abagyan 1998), which uses differences from average residue energies in different environments to indicate which parts of a protein structure might be problematic. Melo and Feytmans (1998) use an atomic pairwise potential and a surface-based solvation potential (both knowledge-based) to evaluate protein structures. Apart from the energy strain method, which is a semiempirical approach based on the ECEPP3 force field (Nemethy et al. 1992), all of the local methods listed above are based on statistical potentials. A conceptually distinct approach is the ProQres method, which was very recently introduced by Wallner and Elofsson (2006). ProQres is based on a neural network that combines structural features to distinguish correct from incorrect regions. ProQres was shown to outperform earlier methodologies based on statistical approaches (Verify3D, ProsaII, and Errat). The data presented in Wallner and Elofsson's study suggests that their machine-learning approach based on structural features is indeed superior to statistics-based methods. However, the knowledge-based methods examined in their work, Verify3D (Luthy et al. 1992; Eisenberg et al. 1997), Prosa (Sippl 1993), and Errat (Colovos and Yeates 1993), are not based on newer statistical potentials.
460:
statistical potentials (Sippl 1995; Melo and Feytmans 1998; Samudrala and Moult 1998; Rojnuckarin and Subramaniam 1999; Lu and Skolnick 2001; Wallqvist et al. 2002; Zhou and Zhou 2002), residue environments (Luthy et al. 1992; Eisenberg et al. 1997; Park et al. 1997; Summa et al. 2005), local side-chain and backbone interactions (Fang and Shortle 2005), orientation-dependent properties (Buchete et al. 2004a,b; Hamelryck 2005), packing estimates (Berglund et al. 2004), solvation energy (Petrey and Honig 2000; McConkey et al. 2003; Wallner and Elofsson 2003; Berglund et al. 2004), hydrogen bonding (Kortemme et al. 2003), and geometric properties (Colovos and Yeates 1993; Kleywegt 2000; Lovell et al. 2003; Mihalek et al. 2003). A number of methods combine different potentials into a global score, usually using a linear combination of terms (Kortemme et al. 2003; Tosatto 2005), or with the help of machine learning techniques, such as neural networks (Wallner and Elofsson 2003) and support vector machines (SVM) (Eramian et al. 2006). Comparisons of different global model quality assessment programs can be found in recent papers by Pettitt et al. (2005), Tosatto (2005), and Eramian et al. (2006).
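As a minimal sketch of the linear-combination approach described above, the following Python snippet sums weighted score terms into one global score and ranks candidate models by it. The term names, weights, values, and the convention that lower scores are better are illustrative assumptions and are not taken from any of the cited methods.

```python
from typing import Dict, List, Tuple


def global_score(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the individual score terms of one model."""
    return sum(weights[name] * value for name, value in terms.items())


def rank_models(
    models: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (model name, global score) pairs, best (lowest) score first."""
    scored = [(name, global_score(terms, weights)) for name, terms in models.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical weights and per-model term values, for illustration only.
WEIGHTS = {"pairwise": 1.0, "solvation": 0.5, "hbond": 0.25}
MODELS = {
    "model_a": {"pairwise": -10.0, "solvation": -4.0, "hbond": -8.0},
    "model_b": {"pairwise": -12.0, "solvation": -2.0, "hbond": -8.0},
}

ranking = rank_models(MODELS, WEIGHTS)
```

In practice the weights would be fitted on a benchmark set of models with known quality rather than chosen by hand.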
555:, or a sequence alignment produced on the basis of comparing two solved structures, dramatically reduces the errors in final models; these "gold standard" alignments can be used as input to current modeling methods to produce quite accurate reproductions of the original experimental structure. Results from the most recent CASP experiment suggest that "consensus" methods collecting the results of multiple fold recognition and multiple alignment searches increase the likelihood of identifying the correct template; similarly, the use of multiple templates in the model-building step may be worse than the use of the single correct template but better than the use of a single suboptimal one. Alignment errors may be minimized by the use of a multiple alignment even if only one template is used, and by the iterative refinement of local regions of low similarity. Errors in the template structure are a lesser source of model error. The 464:
refined, which should be considered for modeling by multiple templates, and which should be predicted ab initio. Information on local model quality could also be used to reduce the combinatorial problem when considering alternative alignments; for example, by scoring different local models separately, fewer models would have to be built (assuming that the interactions between the separate regions are negligible or can be estimated separately).
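One way local scores could drive such decisions is sketched below in Python: given one quality score per residue, the function returns the contiguous stretches that fall below a cutoff and are therefore candidates for refinement or remodeling. The score scale, the cutoff, and the minimum region length are assumptions made for illustration.

```python
from typing import List, Tuple


def low_quality_regions(
    scores: List[float], threshold: float, min_length: int = 1
) -> List[Tuple[int, int]]:
    """Return inclusive, 0-based (start, end) residue ranges scoring below threshold."""
    regions = []
    start = None  # start index of the low-scoring run currently being scanned
    for i, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start >= min_length:
        regions.append((start, len(scores) - 1))
    return regions


# Hypothetical per-residue scores (higher assumed to mean more reliable).
EXAMPLE_SCORES = [0.9, 0.8, 0.2, 0.1, 0.3, 0.9, 0.4, 0.9, 0.1, 0.2]
flagged = low_quality_regions(EXAMPLE_SCORES, threshold=0.5, min_length=2)
```

With these example values, the isolated low score at index 6 is ignored because it is shorter than the assumed minimum region length.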
518:. This error is comparable to the typical resolution of a structure solved by NMR. In the 30–50% identity range, errors can be more severe and are often located in loops. Below 30% identity, serious errors occur, sometimes resulting in the basic fold being mis-predicted. This low-identity region is often referred to as the "twilight zone" within which homology modeling is extremely difficult, and to which it is possibly less suited than 31: 256:-value should generally not be chosen, even if it is the only one available, since it may well have a wrong structure, leading to the production of a misguided model. A better approach is to submit the primary sequence to fold-recognition servers or, better still, consensus meta-servers which improve upon individual fold-recognition servers by identifying similarities (consensus) among independent predictions. 419:; they are the most susceptible to major modeling errors and occur with higher frequency when the target and template have low sequence identity. The coordinates of unmatched sections determined by loop modeling programs are generally much less accurate than those obtained from simply copying the coordinates of a known structure, particularly if the loop is longer than 10 residues. The first two sidechain 201:
production of sequence alignments; however, these alignments may not be of sufficient quality because database search techniques prioritize speed over alignment quality. These processes can be performed iteratively to improve the quality of the final model, although quality assessments that are not dependent on the true target structure are still under development.
147:, derive from errors in the initial sequence alignment and from improper template selection. Like other methods of structure prediction, current practice in homology modeling is assessed in a biennial large-scale experiment known as the Critical Assessment of Techniques for Protein Structure Prediction, or Critical Assessment of Structure Prediction ( 484:) is a community-wide prediction experiment that runs every two years during the summer months and challenges prediction teams to submit structural models for a number of sequences whose structures have recently been solved experimentally but have not yet been published. Its partner Critical Assessment of Fully Automated Structure Prediction ( 1641:
Ursula Pieper, Narayanan Eswar, Hannes Braberg, M.S. Madhusudhan, Fred Davis, Ashley C. Stuart, Nebojsa Mirkovic, Andrea Rossi, Marc A. Marti-Renom, Andras Fiser, Ben Webb, Daniel Greenblatt, Conrad Huang, Tom Ferrin, Andrej Sali. MODBASE, a database of annotated comparative protein structure models,
525:
At high sequence identities, the primary source of error in homology modeling derives from the choice of the template or templates on which the model is based, while lower identities exhibit serious errors in sequence alignment that inhibit the production of high-quality models. It has been suggested
398:
studies, which provide low-resolution information that is not usually itself sufficient to generate atomic-resolution structural models. To address the problem of inaccuracies in initial target-template sequence alignment, an iterative procedure has also been introduced to refine the alignment on the
259:
Often several candidate template structures are identified by these approaches. Although some methods can generate hybrid models with better accuracy from multiple templates, most methods rely on a single template. Therefore, choosing the best template from among the candidates is a key step, and can
178:
There are exceptions to the general rule that proteins sharing significant sequence identity will share a fold. For example, a judiciously chosen set of mutations of less than 50% of a protein can cause the protein to adopt a completely different fold. However, such a massive structural rearrangement
598:
states of side chains and their internal packing arrangement also present difficulties in homology modeling, even in targets for which the backbone structure is relatively easy to predict. This is partly because many side chains in crystal structures are not in their "optimal" rotameric
586:
where high local flexibility increases the difficulty of resolving the region by structure-determination methods. Although some guidance is provided even with a single template by the positioning of the ends of the missing region, the longer the gap, the more difficult it is to model. Loops of up to
135:
conclusions about the biochemistry of the query sequence, especially in formulating hypotheses about why certain residues are conserved, which may in turn lead to experiments to test those hypotheses. For example, the spatial arrangement of conserved residues may suggest whether a particular residue
87:
The quality of the homology model depends on the quality of the sequence alignment and template structure. The approach can be complicated by the presence of alignment gaps (commonly called indels) that indicate a structural region present in the target but not in the template, and by structure
581:
mutation or a gap in a solved structure result in a region of target sequence for which there is no corresponding template. This problem can be minimized by the use of multiple templates, but the method is complicated by the templates' differing local structures around the gap and by the likelihood
318:
where the majority of the sequence differences were localized. Thus unsolved proteins could be modeled by first constructing the conserved core and then substituting variable regions from other proteins in the set of solved structures. Current implementations of this method differ mainly in the way
277:
defined pairwise alignments between the target sequence and a single identified template as a means of exploring "alignment space" in regions of sequence with low local similarity. "Profile-profile" alignments that first generate a sequence profile of the target and systematically compare it to the
204:
Optimizing the speed and accuracy of these steps for use in large-scale automated structure prediction is a key component of structural genomics initiatives, partly because the resulting volume of data will be too large to process manually and partly because the goal of structural genomics requires
467:
One of the most widely used local scoring methods is Verify3D (Luthy et al. 1992; Eisenberg et al. 1997), which combines secondary structure, solvent accessibility, and polarity of residue environments. ProsaII (Sippl 1993), which is based on a combination of a pairwise statistical potential and a
463:
Less work has been reported on the local quality assessment of models. Local scores are important in the context of modeling because they can give an estimate of the reliability of different regions of a predicted structure. This information can be used in turn to determine which regions should be
268:
of the aligned regions: the fraction of the query sequence structure that can be predicted from the template, and the plausibility of the resulting model. Thus, sometimes several homology models are produced for a single query sequence, with the most likely candidate chosen only in the final step.
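The fraction of the query covered by a template can be computed directly from a gapped pairwise alignment. The Python sketch below assumes that both aligned strings have equal length and that `-` marks a gap; the example sequences are made up.

```python
def alignment_coverage(aligned_query: str, aligned_template: str, gap: str = "-") -> float:
    """Fraction of query residues that are aligned to a template residue."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    query_length = sum(1 for q in aligned_query if q != gap)
    if query_length == 0:
        raise ValueError("query contains no residues")
    covered = sum(
        1 for q, t in zip(aligned_query, aligned_template) if q != gap and t != gap
    )
    return covered / query_length


# Hypothetical alignment: two of the eight query residues face a template gap.
coverage = alignment_coverage("MKTAYIAK", "MK--YLAK")
```

Candidate templates could then be compared on coverage alongside sequence similarity before any model is built.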
459:
A large number of methods have been developed for selecting a native-like structure from a set of models. Scoring functions have been based on both molecular mechanics energy functions (Lazaridis and Karplus 1999; Petrey and Honig 2000; Feig and Brooks 2002; Felts et al. 2002; Lee and Duan 2004),
118:
packing and position also increase with decreasing identity, and variations in these packing configurations have been suggested as a major reason for poor model quality at low identity. Taken together, these various atomic-position errors are significant and impede the use of homology models for
492:
and EVA run continuously to assess participating servers' performance in prediction of imminently released structures from the PDB. CASP and CAFASP serve mainly as evaluations of the state of the art in modeling, while the continuous assessments seek to evaluate the model quality that would be
200:
The homology modeling procedure can be broken down into four sequential steps: template selection, target-template alignment, model construction, and model assessment. The first two steps are often essentially performed together, as the most common methods of identifying templates rely on the
603:
and in the packing of the individual molecules in a protein crystal. One method of addressing this problem requires searching a rotameric library to identify locally low-energy combinations of packing states. It has been suggested that a major reason that homology modeling is so difficult when
541:
parameterizations may not be sufficiently accurate for this task, since homology models used as starting structures for molecular dynamics tend to produce slightly worse structures. Slight improvements have been observed in cases where significant restraints were used during the simulation.
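Whether a refinement run helped is usually judged by the root mean square deviation (RMSD) between model and experimental coordinates. The Python sketch below computes RMSD over matched atoms; it assumes the two structures have already been superimposed and that the coordinate lists are in the same atom order, which real tools handle with a separate fitting step.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coordinate], coords_b: Sequence[Coordinate]) -> float:
    """Root mean square deviation between two pre-superimposed coordinate sets."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical coordinates in angstroms: every atom is displaced by 1 along z.
MODEL = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
EXPERIMENT = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(MODEL, EXPERIMENT)
```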
187:
properly and carry out its function in the cell. Consequently, the roughly folded structure of a protein (its "topology") is conserved longer than its amino-acid sequence and much longer than the corresponding DNA sequence; in other words, two proteins may share a similar fold even if their
76:
Evolutionarily related proteins have similar sequences and naturally occurring homologous proteins have similar protein structure. It has been shown that three-dimensional protein structure is evolutionarily more conserved than would be expected on the basis of sequence conservation alone.
632:. Even low-accuracy homology models can be useful for these purposes, because their inaccuracies tend to be located in the loops on the protein surface, which are normally more variable even between closely related proteins. The functional regions of the protein, especially its 167:. Thus, even proteins that have diverged appreciably in sequence but still share detectable similarity will also share common structural properties, particularly the overall fold. Because it is difficult and time-consuming to obtain experimental structures from methods such as 72:
that maps residues in the query sequence to residues in the template sequence. Protein structures are more conserved than protein sequences among homologues, but sequences with less than 20% sequence identity can have very different structures.
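Sequence identity itself is simple to compute once an alignment exists. The Python sketch below uses one common convention, counting identical residues over the columns in which both sequences have a residue; other conventions divide by the shorter sequence length or by the full alignment length, so the choice here is an assumption. The example sequences are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical alignment: six aligned columns, five of them identical.
identity = percent_identity("MKTAYIAK", "MK--YLAK")
```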
248:-value, which are considered sufficiently close in evolution to make a reliable homology model. Other factors may tip the balance in marginal cases; for example, the template may have a function similar to that of the query sequence, or it may belong to a homologous 213:
The critical first step in homology modeling is the identification of the best template structure, if indeed any are available. The simplest method of template identification relies on serial pairwise sequence alignments aided by database search techniques such as
230:
to successively identify more distantly related homologs. This family of methods has been shown to produce a larger number of potential templates and to identify better templates for sequences that have only distant relationships to any solved structure.
243:
are more sensitive than purely sequence(profile)-based methods when only distantly-related templates are available for the proteins under prediction. When performing a BLAST search, a reliable first approach is to identify hits with a sufficiently low
272:
It is possible to use the sequence alignment generated by the database search technique as the basis for the subsequent model production; however, more sophisticated approaches have also been explored. One proposal generates an ensemble of
1961:
Gopal, S; Schroeder, M; Pieper, U; Sczyrba, A; Aytekin-Kurban, G; Bekiranov, S; Fajardo, JE; Eswar, N; Sanchez, R; et al. (2001). "Homology-based annotation yields 1,042 new candidate genes in the Drosophila melanogaster genome".
1796:
Flohil, JA; Vriend, G; Berendsen, HJ. (2002). "Completion and refinement of 3-D homology models with restricted molecular dynamics: application to targets 47, 58, and 111 in the CASP modeling competition and posterior analysis".
671:, resulting in nearly 1000 quality models for proteins whose structures had not yet been determined at the time of the study, and identifying novel relationships between 236 yeast proteins and other previously solved structures. 260:
affect the final accuracy of the structure significantly. This choice is guided by several factors, such as the similarity of the query and template sequences, of their functions, and of the predicted query and observed template
488:) has run in parallel with CASP but evaluates only models produced via fully automated servers. Continuously running experiments that do not have prediction 'seasons' focus mainly on benchmarking publicly available webservers. 501:
The accuracy of the structures generated by homology modeling is highly dependent on the sequence identity between target and template. Above 50% sequence identity, models tend to be reliable, with only minor errors in
550:
The two most common and large-scale sources of error in homology modeling are poor template selection and inaccuracies in target-template sequence alignment. Controlling for these two factors by using a
107:
agreement at 25% sequence identity. However, the errors are significantly higher in the loop regions, where the amino acid sequences of the target and template proteins may be completely different.
143:
consortium dedicated to the production of representative experimental structures for all classes of protein folds. The chief inaccuracies in homology modeling, which worsen with lower
175:
for every protein of interest, homology modeling can provide useful structural models for generating hypotheses about a protein's function and directing further experimental work.
447:) can cause relatively large errors in the positions of the atoms at the terminus of the side chain; such atoms are often functionally important, particularly when located near the 639:
Homology models can also be used to identify subtle differences between related proteins that have not all been solved structurally. For example, the method was used to identify
235:, also known as fold recognition or 3D-1D alignment, can also be used as a search technique for identifying templates to be used in traditional homology modeling methods. Recent 335:. Thus, sequence alignment is done over segments rather than over the entire protein. Selection of the template for each segment is based on sequence similarity, comparisons of 68:"). Homology modeling relies on the identification of one or more known protein structures likely to resemble the structure of the query sequence, and on the production of a 949:
Venclovas C, Margeleviĉius M (2005). "Comparative modeling in CASP6 using consensus approach to template selection, sequence-structure alignment, and structure assessment".
290:
Given a template and an alignment, the information contained therein must be used to generate a three-dimensional structural model of the target, represented as a set of
530:
between two proteins of known structure can be used as input to current modeling methods to produce quite accurate reproductions of the original experimental structure.
386:
This method has been dramatically expanded to apply specifically to loop modeling, which can be extremely difficult due to the high flexibility of loops in proteins in
314:
identified a sharp distinction between "core" structural regions conserved in all experimental structures in the class, and variable regions typically located in the
355:
The most common current homology modeling method takes its inspiration from calculations required to construct a three-dimensional structure from data generated by
1558:
Topf, M; Baker, ML; Marti-Renom, MA; Chiu, W; Sali, A. (2006). "Refinement of protein structures by iterative comparative modeling and CryoEM density fitting".
881:
Yousif, Ragheed Hussam, et al. "Exploring the Molecular Interactions between Neoculin and the Human Sweet Taste Receptors through Computational Approaches."
587:
about 9 residues can be modeled with moderate accuracy in some cases if the local alignment is correct. Larger regions are often modeled individually using
511: 613: 566:
database lists several million, mostly very small but occasionally dramatic, errors in experimental (template) structures that have been deposited in the
604:
target-template sequence identity lies below 30% is that such proteins have broadly similar folds but widely divergent side chain packing arrangements.
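The rotamer-library search mentioned above can be illustrated with a small exhaustive search in Python. Each residue is offered a few candidate rotamers and every combination is scored with a toy energy function; the residue names, rotamer labels, and energy values are hypothetical, and the search finds the global minimum only because the example is tiny. Exhaustive enumeration grows exponentially with the number of residues, so practical programs rely on pruning or heuristic search instead.

```python
import math
from itertools import product
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, str]


def best_rotamer_combination(
    rotamers: Dict[str, List[str]], energy: Callable[[Assignment], float]
) -> Tuple[Assignment, float]:
    """Try every rotamer combination and return the lowest-energy assignment."""
    residues = sorted(rotamers)
    best_assignment: Assignment = {}
    best_energy = math.inf
    for combo in product(*(rotamers[res] for res in residues)):
        assignment = dict(zip(residues, combo))
        value = energy(assignment)
        if value < best_energy:
            best_assignment, best_energy = assignment, value
    return best_assignment, best_energy


# Hypothetical per-rotamer energies and one pairwise clash penalty.
SELF_ENERGY = {
    ("L10", "t"): 0.0,
    ("L10", "g+"): 1.0,
    ("F25", "t"): 0.5,
    ("F25", "g-"): 0.0,
}
CLASH_PENALTY = {(("L10", "t"), ("F25", "g-")): 5.0}


def toy_energy(assignment: Assignment) -> float:
    """Sum of self energies plus penalties for clashing rotamer pairs."""
    total = sum(SELF_ENERGY[(res, rot)] for res, rot in assignment.items())
    for ((res_a, rot_a), (res_b, rot_b)), penalty in CLASH_PENALTY.items():
        if assignment[res_a] == rot_a and assignment[res_b] == rot_b:
            total += penalty
    return total


ROTAMERS = {"L10": ["t", "g+"], "F25": ["t", "g-"]}
best, energy_value = best_rotamer_combination(ROTAMERS, toy_energy)
```

Here the two individually best rotamers clash, so the search settles on a combination in which one residue takes a slightly less favorable state.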
582:
that a missing region in one experimental structure is also missing in other structures of the same protein family. Missing regions are most common in
139:
Homology modeling can produce high-quality structural models when the target and template are closely related, which has inspired the formation of a
136:
is conserved to stabilize the folding, to participate in binding some small molecule, or to foster association with another protein or nucleic acid.
654:
simulations, homology models can also generate hypotheses about the kinetics and dynamics of a protein, as in studies of the ion selectivity of a
480:
efforts have been made to assess the relative quality of various current homology modeling methods. Critical Assessment of Structure Prediction (
431:) can usually be estimated within 30° for an accurate backbone structure; however, the later dihedral angles found in longer side chains such as 1926:
Wilson, C; Gregoret, LM; Agard, DA. (1993). "Modeling side-chain conformation for homologous proteins using an energy-based rotamer search".
685: 331:
The segment-matching method divides the target into a series of short segments, each of which is matched to its own template fitted from the
80:
The sequence alignment and template structure are then used to produce a structural model of the target. Because protein structures are more
278:
sequence profiles of solved structures; the coarse-graining inherent in the profile construction is thought to reduce noise introduced by
188:
evolutionary relationship is so distant that it cannot be discerned reliably. For comparison, the function of a protein is conserved much
131:
of a protein may be difficult to predict from homology models of its subunit(s). Nevertheless, homology models can be useful in reaching
319:
they deal with regions that are not conserved or that lack a template. The variable regions are often constructed with the help of a
227: 124: 588: 192:
than the protein sequence, since relatively few changes in amino-acid sequence are required to take on a related function.
680: 360: 359:. One or more target-template alignments are used to construct a set of geometrical criteria that are then converted to 617: 223: 533:
Attempts have been made to improve the accuracy of homology models built with existing methods by subjecting them to
2196: 559: 100: 84:
than DNA sequences, and detectable levels of sequence similarity usually imply significant structural similarity.
1217:
Rychlewski, L; Zhang, B; Godzik, A. (1998). "Fold and function predictions for Mycoplasma genitalium proteins".
2191: 667: 320: 1877:
Kryshtafovych A, Venclovas C, Fidelis K, Moult J. (2005). Progress over the first decade of CASP experiments.
820: 526:
that the major impediment to producing high-quality models is inadequate sequence alignment, since "optimal"
2186: 538: 399:
basis of the initial structural fit. The most commonly used software in spatial restraint-based modeling is
395: 303: 81: 205:
providing models of reasonable quality to researchers who are not themselves structure prediction experts.
868:
Chung SY, Subbiah S. (1996). A structural explanation for the twilight zone of protein sequence homology.
1027:
Dalal, S; Balasubramanian, S; Regan, L. (1997). "Protein alchemy: changing beta-sheet into alpha-helix".
695: 291: 219: 168: 89: 306:
structural fragments identified in closely related solved structures. For example, a modeling study of
2066:"Homology Modeling and Molecular Dynamics Simulation Studies of an Inward Rectifier Potassium Channel" 1482:
Sali, A; Blundell, TL. (1993). "Comparative protein modelling by satisfaction of spatial restraints".
2134: 2077: 2018: 1707: 1266: 552: 527: 344: 128: 1595:"Comparative protein structure modeling by iterative alignment, model building and model assessment" 621: 583: 376: 364: 315: 261: 140: 294:
for each atom in the protein. Three major classes of model generation methods have been proposed.
1987: 1822: 1778: 1290: 1052: 974: 926: 848: 788: 651: 574: 534: 380: 160: 69: 2162: 2103: 2046: 1979: 1943: 1908: 1860: 1814: 1770: 1735: 1673: 1624: 1575: 1540: 1499: 1464: 1423: 1374: 1339: 1282: 1234: 1199: 1155: 1106: 1044: 1009: 966: 918: 840: 746: 690: 650:
and to propose hypotheses about different ATPases' binding affinity. Used in conjunction with
578: 567: 537:
simulation in an effort to improve their RMSD to the experimental structure. However, current
519: 332: 232: 164: 144: 93: 61: 1194: 1177: 2152: 2142: 2093: 2085: 2036: 2026: 1971: 1935: 1900: 1852: 1806: 1762: 1725: 1715: 1665: 1614: 1606: 1567: 1530: 1491: 1454: 1413: 1405: 1366: 1329: 1321: 1274: 1226: 1189: 1145: 1137: 1096: 1088: 1036: 1001: 992:
Dalal, S; Balasubramanian, S; Regan, L (1997). "Transmuting alpha helices and beta sheets".
958: 908: 832: 780: 736: 728: 600: 391: 368: 356: 88:
gaps in the template that arise from poor resolution in the experimental procedure (usually
383:
energy minimization to iteratively refine the positions of all heavy atoms in the protein.
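The idea of refining coordinates until they satisfy a set of restraints can be sketched in a few lines of Python. This toy version makes two simplifying assumptions that differ from the method described above: restraints are plain target distances with a squared-error penalty rather than probability density functions, and the optimizer is ordinary gradient descent rather than conjugate gradient. The coordinates, restraints, step size, and iteration count are all illustrative.

```python
import math
from typing import List, Sequence, Tuple

Restraint = Tuple[int, int, float]  # (atom index i, atom index j, target distance)


def restraint_energy(coords: Sequence[Sequence[float]], restraints: Sequence[Restraint]) -> float:
    """Sum of squared deviations between actual and target distances."""
    return sum(
        (math.dist(coords[i], coords[j]) - target) ** 2 for i, j, target in restraints
    )


def minimize(
    coords: Sequence[Sequence[float]],
    restraints: Sequence[Restraint],
    step: float = 0.05,
    iterations: int = 2000,
) -> List[List[float]]:
    """Plain gradient descent on the restraint energy; returns new coordinates."""
    current = [list(c) for c in coords]
    for _ in range(iterations):
        gradients = [[0.0, 0.0, 0.0] for _ in current]
        for i, j, target in restraints:
            distance = math.dist(current[i], current[j])
            if distance == 0.0:
                continue  # direction undefined for coincident atoms
            factor = 2.0 * (distance - target) / distance
            for k in range(3):
                g = factor * (current[i][k] - current[j][k])
                gradients[i][k] += g
                gradients[j][k] -= g
        for atom, gradient in zip(current, gradients):
            for k in range(3):
                atom[k] -= step * gradient[k]
    return current


# Hypothetical three-atom system whose starting distances are all too short.
START = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.5, 0.0)]
RESTRAINTS = [(0, 1, 1.5), (1, 2, 1.5), (0, 2, 2.5)]
refined = minimize(START, RESTRAINTS)
```

Comparing `restraint_energy(START, RESTRAINTS)` with `restraint_energy(refined, RESTRAINTS)` shows how far the violations were reduced.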
1077:"RaptorX: Exploiting structure information for protein alignment by statistical inference" 563: 307: 184: 2064:
Capener, CE; Shrivastava, IH; Ranatunga, KM; Forrest, LR; Smith, GR; Sansom, MSP (2000).
302:
The original method of homology modeling relied on the assembly of a complete model from
2138: 2081: 2022: 1711: 1696:"The protein structure prediction problem could be solved using the current PDB library" 1270: 2098: 2065: 1418: 1393: 1334: 1309: 1150: 1125: 1101: 1076: 741: 732: 716: 420: 372: 2089: 2041: 2006: 1904: 1730: 1695: 1619: 1594: 1230: 1005: 658:
channel. Large-scale automated modeling of all identified protein-coding regions in a
64:
and an experimental three-dimensional structure of a related homologous protein (the "
30: 2180: 2157: 2122: 1656:
Blake, JD; Cohen, FE. (2001). "Pairwise sequence alignment below the twilight zone".
1459: 1442: 1370: 836: 416: 279: 111: 1826: 1535: 1518: 978: 930: 852: 792: 1991: 1782: 1753:
Koehl, P; Levitt, M. (1999). "A brighter future for protein structure prediction".
1294: 1056: 643: 477: 336: 1357:
Greer, J. (1981). "Comparative model-building of the mammalian serine proteases".
1257:
Baker, D; Sali, A (2001). "Protein structure prediction and structural genomics".
17: 415:
Regions of the target sequence that are not aligned to a template are modeled by
765: 633: 448: 172: 120: 35: 2123:"Large-scale protein structure modeling of the Saccharomyces cerevisiae genome" 1856: 1843:
Ginalski, K. (2006). "Comparative modeling for protein structure prediction".
1571: 1325: 784: 503: 274: 115: 2147: 819:
Marti-Renom, MA; Stuart, AC; Fiser, A; Sanchez, R; Melo, F; Sali, A. (2000).
1720: 1278: 766:"Why similar protein sequences encode similar three-dimensional structures?" 655: 489: 183:, especially since the protein is usually under the constraint that it must 180: 2107: 2050: 2031: 1983: 1939: 1864: 1818: 1774: 1739: 1677: 1669: 1628: 1579: 1544: 1495: 1427: 1343: 1286: 1203: 1159: 1110: 970: 922: 844: 717:"The relation between the divergence of sequence and structure in proteins" 226:– of which PSI-BLAST is the most common example – iteratively update their 2166: 1947: 1912: 1503: 1468: 1378: 1238: 1048: 1013: 750: 110:
Regions of the model that were constructed without a template, usually by
1610: 1443:"Accurate modeling of protein conformation by automatic segment matching" 1409: 515: 436: 400: 390:
solution. A more recent expansion applies the spatial-restraint model to
159:
The method of homology modeling is based on the observation that protein
114:, are generally much less accurate than the rest of the model. Errors in 104: 97: 1394:"All are not equal: A benchmark of different homology modeling programs" 1040: 556: 1810: 1141: 1092: 962: 595: 591:
techniques, although this approach has met with only isolated success.
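To decide how each unmatched region might be handled, the gaps first have to be located in the alignment. The Python sketch below finds runs of target residues that face a template gap and labels them by length. The 9-residue cutoff follows the approximate figure quoted above; the labels, the `-` gap symbol, and the example sequences are assumptions made for illustration.

```python
from typing import List, Tuple

MAX_LOOP_LENGTH = 9  # approximate limit quoted in the text for loop modeling


def unaligned_target_segments(
    aligned_target: str, aligned_template: str, gap: str = "-"
) -> List[Tuple[int, int]]:
    """Return (start, length) of target residue runs with no template counterpart.

    start is a 0-based index into the ungapped target sequence.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    segments = []
    position = 0  # index of the next residue in the ungapped target
    run_start = None
    for target_char, template_char in zip(aligned_target, aligned_template):
        if target_char != gap and template_char == gap:
            if run_start is None:
                run_start = position
        elif run_start is not None:
            segments.append((run_start, position - run_start))
            run_start = None
        if target_char != gap:
            position += 1
    if run_start is not None:
        segments.append((run_start, position - run_start))
    return segments


def suggested_strategy(length: int) -> str:
    """Label a segment by length; the labels are illustrative, not prescriptive."""
    return "loop modeling" if length <= MAX_LOOP_LENGTH else "ab initio prediction"


# Hypothetical alignment with a three-residue and a one-residue unmatched region.
segments = unaligned_target_segments("MKTAYIAKQR", "MK---IAK-R")
strategies = [suggested_strategy(length) for _, length in segments]
```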
507: 404: 387: 240: 57: 92:) used to solve the structure. Model quality declines with decreasing 52:
of protein, refers to constructing an atomic-resolution model of the "
659: 647: 640: 636:, tend to be more highly conserved and thus more accurately modeled. 629: 485: 432: 340: 311: 249: 514:
between the modeled and the experimental structure falling around 1
1975: 913: 896: 663: 493:
obtained by a non-expert user employing publicly available tools.
215: 103:
between the matched C atoms at 70% sequence identity but only 2–4
39: 1766: 439:
are notoriously difficult to predict. Moreover, small errors in χ
239:
experiments indicate that some protein threading methods such as
27:
Method of protein structure prediction using other known proteins
625: 481: 236: 148: 2007:"Homology modeling of the cation binding sites of Na+K+-ATPase" 821:"Comparative protein structure modeling of genes and genomes" 1519:"ModLoop: automated modeling of loops in protein structures" 407:
has been established for reliable models generated with it.
573:
Serious local errors can arise in homology models where an
363:
for each restraint. Restraints applied to the main protein
1310:"Progress and challenges in protein structure prediction" 1891:
Vasquez, M. (1996). "Modeling side-chain conformation".
119:
purposes that require atomic-resolution data, such as
347:
of the divergent atoms between target and template.


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
