%0 Journal Article %J Cell Cycle (Georgetown, Tex.) %D 2008 %T Uncharacterized DUF1574 leptospira proteins are SGNH hydrolases %A Lukasz Knizewski %A Kamil Steczkiewicz %A Krzysztof Kuchta %A Lucjan Wyrwicz %A Dariusz Plewczynski %A Andrzej Koliński %A Leszek Rychlewski %A Krzysztof Ginalski %K Amino Acid Sequence %K Bacterial Proteins %K Bacterial Proteins: genetics %K Base Sequence %K Computational Biology %K DNA %K Hydrolases %K Hydrolases: genetics %K Leptospira %K Leptospira: enzymology %K Models %K Molecular %K Molecular Sequence Data %K Sequence Alignment %K Sequence Analysis %B Cell Cycle (Georgetown, Tex.) %V 7 %P 542–4 %8 feb %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/18235229 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2007 %T Comparative modeling without implicit sequence alignments %A Andrzej Koliński %A Dominik Gront %K Algorithms %K Amino Acid Sequence %K Chemical %K Computer Simulation %K Models %K Molecular %K Molecular Sequence Data %K Protein %K Protein Conformation %K Protein: methods %K Proteins %K Proteins: chemistry %K Proteins: ultrastructure %K Sequence Alignment %K Sequence Alignment: methods %K Sequence Analysis %X
MOTIVATION: The number of known protein sequences is about a thousand times larger than the number of experimentally solved 3D structures. For more than half of the protein sequences a close or distant structural analog could be identified. The key starting point in classical comparative modeling is to generate the best possible sequence alignment with a template or templates. With decreasing sequence similarity, the number of errors in the alignments increases and these errors are the main causes of the decreasing accuracy of the molecular models generated. Here we propose a new approach to comparative modeling, which does not require the implicit alignment - the model building phase explores geometric, evolutionary and physical properties of a template (or templates). RESULTS: The proposed method requires prior identification of a template, although the initial sequence alignment is ignored. The model is built using a very efficient reduced representation search engine CABS to find the best possible superposition of the query protein onto the template represented as a 3D multi-featured scaffold. The criteria used include: sequence similarity, predicted secondary structure consistency, local geometric features and hydrophobicity profile. For more difficult cases, the new method qualitatively outperforms existing schemes of comparative modeling. The algorithm unifies de novo modeling, 3D threading and sequence-based methods. The main idea is general and could be easily combined with other efficient modeling tools such as Rosetta, UNRES and others.
%B Bioinformatics (Oxford, England) %V 23 %P 2522–7 %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/17660201 %R 10.1093/bioinformatics/btm380 %0 Journal Article %J Proteins: Structure, Function, Bioinformatics %D 2007 %T Ideal amino acid exchange forms for approximating substitution matrices %A Piotr Pokarowski %A Andrzej Kloczkowski %A Szymon Nowakowski %A Maria Pokarowska %A Robert L. Jernigan %A Andrzej Koliński %K protein contact potentials %K protein structure prediction %K Sequence Alignment %K substitution matrices %X We have analyzed 29 published substitution matrices (SMs) and five statistical protein contact potentials (CPs) for comparison. We find that popular, ‘classical’ SMs obtained mainly from sequence alignments of globular proteins are mostly correlated by at least a value of 0.9. The BLOSUM62 is the central element of this group. A second group includes SMs derived from alignments of remote homologs or transmembrane proteins. These matrices correlate better with classical SMs (0.8) than among themselves (0.7). A third group consists of intermediate links between SMs and CPs - matrices and potentials that exhibit mutual correlations of at least 0.8. Next, we show that SMs can be approximated with a correlation of 0.9 by expressions c_0 + x_i x_j + y_i y_j + z_i z_j, 1 ≤ i, j ≤ 20, where c_0 is a constant and the vectors (x_i), (y_i), (z_i) correlate highly with hydrophobicity, molecular volume and coil preferences of amino acids, respectively. The present paper is the continuation of our work (Pokarowski et al., Proteins 2005;59:49–57), where similar approximations were used to derive ideal amino acid interaction forms from CPs. Both approximations allow us to understand general trends in amino acid similarity and can help improve multiple sequence alignments using the fast Fourier transform (MAFFT), fast threading or other methods based on alignments of physicochemical profiles of protein sequences. 
The use of this approximation in sequence alignments instead of a classical SM yields results that differ by less than 5%. Intermediate links between SMs and CPs, new formulas for approximating these matrices, and the highly significant dependence of classical SMs on coil preferences are new findings. %B Proteins: Structure, Function, Bioinformatics %V 69 %P 379–393 %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/prot.21509/full %R 10.1002/prot %0 Journal Article %J BMC Structural Biology %D 2007 %T Type II restriction endonuclease R.Eco29kI is a member of the GIY-YIG nuclease superfamily %A Elena M. Ibryashkina %A Marina V. Zakharova %A Vladimir B. Baskunov %A Ekaterina S. Bogdanova %A Maxim O. Nagornykh %A Marat M Den'mukhamedov %A Bogdan S. Melnik %A Andrzej Koliński %A Dominik Gront %A Marcin Feder %A Alexander S. Solonin %A Janusz M. Bujnicki %K Amino Acid Sequence %K Binding Sites %K Computational Biology %K Computational Biology: methods %K Deoxyribonucleases %K DNA %K DNA Cleavage %K DNA: metabolism %K Electrophoretic Mobility Shift Assay %K Models %K Molecular %K Molecular Sequence Data %K Mutation %K Protein %K Protein Conformation %K Sequence Alignment %K Structural Homology %K Type II Site-Specific %K Type II Site-Specific: chemist %K Type II Site-Specific: metabol %X BACKGROUND: The majority of experimentally determined crystal structures of Type II restriction endonucleases (REases) exhibit a common PD-(D/E)XK fold. Crystal structures have been also determined for single representatives of two other folds: PLD (R.BfiI) and half-pipe (R.PabI), and bioinformatics analyses supported by mutagenesis suggested that some REases belong to the HNH fold. Our previous bioinformatic analysis suggested that REase R.Eco29kI shares sequence similarities with one more unrelated nuclease superfamily, GIY-YIG, however so far no experimental data were available to support this prediction. 
The determination of a crystal structure of the GIY-YIG domain of homing endonuclease I-TevI provided a template for modeling of R.Eco29kI and prompted us to validate the model experimentally. RESULTS: Using protein fold-recognition methods we generated a new alignment between R.Eco29kI and I-TevI, which suggested a reassignment of one of the putative catalytic residues. A theoretical model of R.Eco29kI was constructed to illustrate its predicted three-dimensional fold and organization of the active site, comprising amino acid residues Y49, Y76, R104, H108, E142, and N154. A series of mutants was constructed to generate amino acid substitutions of selected residues (Y49A, R104A, H108F, E142A and N154L) and the mutant proteins were examined for their ability to bind the DNA containing the Eco29kI site 5'-CCGCGG-3' and to catalyze the cleavage reaction. Experimental data reveal that residues Y49, R104, E142, H108, and N154 are important for the nuclease activity of R.Eco29kI, while H108 and N154 are also important for specific DNA binding by this enzyme. CONCLUSION: Substitutions of residues Y49, R104, H108, E142 and N154 predicted by the model to be a part of the active site lead to mutant proteins with strong defects in the REase activity. These results are in very good agreement with the structural model presented in this work and with our prediction that R.Eco29kI belongs to the GIY-YIG superfamily of nucleases. Our study provides the first experimental evidence for a Type IIP REase that does not belong to the PD-(D/E)XK or HNH superfamilies of nucleases, and is instead a member of the unrelated GIY-YIG superfamily. 
%B BMC Structural Biology %V 7 %P 48 %8 jan %@ 1472680774 %G eng %U http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1952068&tool=pmcentrez&rendertype=abstract %R 10.1186/1472-6807-7-48 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2006 %T BioShell–a package of tools for structural biology computations %A Dominik Gront %A Andrzej Koliński %K Chemical %K Computational Biology %K Computational Biology: methods %K Computer Simulation %K Databases %K Models %K Protein %K Protein: methods %K Proteins %K Proteins: analysis %K Proteins: chemistry %K Proteins: classification %K Sequence Alignment %K Sequence Alignment: methods %K Sequence Analysis %K Software %X SUMMARY: BioShell is a suite of programs performing common tasks accompanying protein structure modeling. BioShell design is based on UNIX shell flexibility and should be used as its extension. Using BioShell various molecular modeling procedures can be integrated in a single pipeline. AVAILABILITY: BioShell package can be downloaded from its website http://biocomp.chem.uw.edu.pl/BioShell and these pages provide many examples and a detailed documentation for the newest version.
%B Bioinformatics (Oxford, England) %V 22 %P 621–622 %8 mar %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/16407320 %R 10.1093/bioinformatics/btk037 %0 Journal Article %J Journal of Computer-Aided Molecular Design %D 2006 %T Three dimensional model of severe acute respiratory syndrome coronavirus helicase ATPase catalytic domain and molecular design of severe acute respiratory syndrome coronavirus helicase inhibitors %A Marcin Hoffmann %A Krystian Eitner %A Marcin von Grotthuss %A Leszek Rychlewski %A Ewa Banachowicz %A Tomasz Grabarkiewicz %A Tomasz Szkoda %A Andrzej Koliński %K Amino Acid Sequence %K Catalytic Domain %K Conserved Sequence %K DNA Helicases %K DNA Helicases: antagonists & inhibitors %K DNA Helicases: chemistry %K Drug Design %K Enzyme Inhibitors %K Enzyme Inhibitors: pharmacology %K Models %K Molecular %K Molecular Sequence Data %K Protein %K SARS Virus %K SARS Virus: enzymology %K Sequence Alignment %K Structural Homology %K Thermodynamics %X The modeling of the severe acute respiratory syndrome coronavirus helicase ATPase catalytic domain was performed using the protein structure prediction Meta Server and the 3D Jury method for model selection, which resulted in the identification of 1JPR, 1UAA and 1W36 PDB structures as suitable templates for creating a full atom 3D model. This model was further utilized to design small molecules that are expected to block an ATPase catalytic pocket thus inhibit the enzymatic activity. Binding sites for various functional groups were identified in a series of molecular dynamics calculation. Their positions in the catalytic pocket were used as constraints in the Cambridge structural database search for molecules having the pharmacophores that interacted most strongly with the enzyme in a desired position. 
The subsequent MD simulations followed by calculations of binding energies of the designed molecules were compared to ATP identifying the most successful candidates, for likely inhibitors - molecules possessing two phosphonic acid moieties at distal ends of the molecule. %B Journal of Computer-Aided Molecular Design %V 20 %P 305–319 %8 may %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/16972168 %R 10.1007/s10822-006-9057-z %0 Journal Article %J Proteins %D 2005 %T Generalized protein structure prediction based on combination of fold-recognition with de novo folding and evaluation of models %A Andrzej Koliński %A Janusz M. Bujnicki %K Algorithms %K Computational Biology %K Computational Biology: methods %K Computer Simulation %K Computers %K Data Interpretation %K Databases %K Dimerization %K Models %K Molecular %K Monte Carlo Method %K Protein %K Protein Conformation %K Protein Folding %K Protein Structure %K Proteomics %K Proteomics: methods %K Reproducibility of Results %K Secondary %K Sequence Alignment %K Software %K Statistical %K Tertiary %X To predict the tertiary structure of full-length sequences of all targets in CASP6, regardless of their potential category (from easy comparative modeling to fold recognition to apparent new folds) we used a novel combination of two very different approaches developed independently in our laboratories, which ranked quite well in different categories in CASP5. First, the GeneSilico metaserver was used to identify domains, predict secondary structure, and generate fold recognition (FR) alignments, which were converted to full-atom models using the "FRankenstein's Monster" approach for comparative modeling (CM) by recombination of protein fragments. Additional models generated "de novo" by fully automated servers were obtained from the CASP website. All these models were evaluated by VERIFY3D, and residues with scores better than 0.2 were used as a source of spatial restraints. 
Second, a new implementation of the lattice-based protein modeling tool CABS was used to carry out folding guided by the above-mentioned restraints with the Replica Exchange Monte Carlo sampling technique. Decoys generated in the course of simulation were subject to the average linkage hierarchical clustering. For a representative decoy from each cluster, a full-atom model was rebuilt. Finally, five models were selected for submission based on combination of various criteria, including the size, density, and average energy of the corresponding cluster, and the visual evaluation of the full-atom structures and their relationship to the original templates. The combination of FRankenstein and CABS was one of the best-performing algorithms over all categories in CASP6 (it is important to note that our human intervention was very limited, and all steps in our method can be easily automated). We were able to generate a number of very good models, especially in the Comparative Modeling and New Folds categories. Frequently, the best models were closer to the native structure than any of the templates used. The main problem we encountered was in the ranking of the final models (the only step of significant human intervention), due to the insufficient computational power, which precluded the possibility of full-atom refinement and energy-based evaluation.
%B Proteins %V 61 Suppl. 7 %P 84–90 %8 jan %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/16187348 %R 10.1002/prot.20723 %0 Journal Article %J Bioinformatics %D 2005 %T HCPM–program for hierarchical clustering of protein models %A Dominik Gront %A Andrzej Koliński %K Algorithms %K Chemical %K Cluster Analysis %K Computer Simulation %K Internet %K Models %K Molecular %K Protein %K Protein: methods %K Proteins %K Proteins: analysis %K Proteins: chemistry %K Sequence Alignment %K Sequence Alignment: methods %K Sequence Analysis %K Software %K User-Computer Interface %X HCPM is a tool for clustering protein structures from comparative modeling, ab initio structure prediction, etc. A hierarchical clustering algorithm is designed and tested, and a heuristic is provided for an optimal cluster selection. The method has been successfully tested during the CASP6 experiment. %B Bioinformatics %V 21 %P 3179–80 %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/15840705 %R 10.1093/bioinformatics/bti450 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2005 %T A new approach to prediction of short-range conformational propensities in proteins %A Dominik Gront %A Andrzej Koliński %K Algorithms %K Amino Acid %K Artificial Intelligence %K Chemical %K Computer Simulation %K Databases %K Gas Chromatography-Mass Spectrometry %K Gas Chromatography-Mass Spectrometry: methods %K Models %K Protein %K Protein Conformation %K Protein: methods %K Proteins %K Proteins: analysis %K Proteins: chemistry %K Sequence Alignment %K Sequence Alignment: methods %K Sequence Analysis %K Sequence Homology %K Structure-Activity Relationship %X MOTIVATION: Knowledge-based potentials are valuable tools for protein structure modeling and evaluation of the quality of the structure prediction obtained by a variety of methods. Potentials of such type could be significantly enhanced by a proper exploitation of the evolutionary information encoded in related protein sequences. 
The new potentials could be valuable components of threading algorithms, ab-initio protein structure prediction, comparative modeling and structure modeling based on fragmentary experimental data. RESULTS: A new potential for scoring local protein geometry is designed and evaluated. The approach is based on the similarity of short protein fragments measured by an alignment of their sequence profiles. Sequence specificity of the resulting energy function has been compared with the specificity of simpler potentials using gapless threading and the ability to predict specific geometry of protein fragments. Significant improvement in threading sensitivity and in the ability to generate sequence-specific protein-like conformations has been achieved.
%B Bioinformatics (Oxford, England) %V 21 %P 981–987 %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/15509604 %R 10.1093/bioinformatics/bti080 %0 Journal Article %J Proteins %D 2001 %T Generalized comparative modeling (GENECOMP): a combination of sequence comparison, threading, and lattice modeling for protein structure prediction and refinement %A Andrzej Koliński %A Marcos Betancourt %A Daisuke Kihara %A Piotr Rotkiewicz %A Jeffrey Skolnick %K Algorithms %K Chemical %K Combinatorial Chemistry Techniques %K Combinatorial Chemistry Techniques: methods %K Computational Biology %K Computational Biology: methods %K Computer Simulation %K Databases %K Factual %K Models %K Molecular %K Monte Carlo Method %K Protein Folding %K Proteins %K Proteins: chemistry %K Sequence Alignment %K Sequence Alignment: methods %X An improved generalized comparative modeling method, GENECOMP, for the refinement of threading models is developed and validated on the Fischer database of 68 probe-template pairs, a standard benchmark used to evaluate threading approaches. The basic idea is to perform ab initio folding using a lattice protein model, SICHO, near the template provided by the new threading algorithm PROSPECTOR. PROSPECTOR also provides predicted contacts and secondary structure for the template-aligned regions, and possibly for the unaligned regions by garnering additional information from other top-scoring threaded structures. Since the lowest-energy structure generated by the simulations is not necessarily the best structure, we employed two structure-selection protocols: distance geometry and clustering. In general, clustering is found to generate somewhat better quality structures in 38 of 68 cases. When applied to the Fischer database, the protocol does no harm and in a significant number of cases improves upon the initial threading model, sometimes dramatically. The procedure is readily automated and can be implemented on a genomic scale. 
%B Proteins %V 44 %P 133–149 %8 aug %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/11391776 %0 Journal Article %J Protein Engineering %D 2001 %T Three-dimensional modeling of the I-TevI homing endonuclease catalytic domain, a GIY-YIG superfamily member, using NMR restraints and Monte Carlo dynamics %A Janusz M. Bujnicki %A Piotr Rotkiewicz %A Andrzej Koliński %A Leszek Rychlewski %K Algorithms %K Binding Sites %K Biomolecular %K Endodeoxyribonucleases %K Endodeoxyribonucleases: chemistry %K Models %K Molecular %K Monte Carlo Method %K Nuclear Magnetic Resonance %K Protein Structure %K Sequence Alignment %K Tertiary %X Using a recent version of the SICHO algorithm for in silico protein folding, we made a blind prediction of the tertiary structure of the N-terminal, independently folded, catalytic domain (CD) of the I-TevI homing endonuclease, a representative of the GIY-YIG superfamily of homing endonucleases. The secondary structure of the I-TevI CD has been determined using NMR spectroscopy, but computational sequence analysis failed to detect any protein of known tertiary structure related to the GIY-YIG nucleases (Kowalski et al., Nucleic Acids Res., 1999, 27, 2115-2125). To provide further insight into the structure-function relationships of all GIY-YIG superfamily members, including the recently described subfamily of type II restriction enzymes (Bujnicki et al., Trends Biochem. Sci., 2000, 26, 9-11), we incorporated the experimentally determined and predicted secondary and tertiary restraints in a reduced (side chain only) protein model, which was minimized by Monte Carlo dynamics and simulated annealing. The subsequently elaborated full atomic model of the I-TevI CD allows the available experimental data to be put into a structural context and suggests that the GIY-YIG domain may dimerize in order to bring together the conserved residues of the active site. 
%B Protein Engineering %V 14 %P 717–721 %8 oct %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/11739889 %0 Journal Article %J Proteins %D 1999 %T A method for the improvement of threading-based protein models %A Andrzej Koliński %A Piotr Rotkiewicz %A Bartosz Ilkowski %A Jeffrey Skolnick %K Amino Acid Sequence %K Computer Simulation %K Evaluation Studies as Topic %K Methods %K Models %K Molecular %K Molecular Sequence Data %K Protein Conformation %K Protein Structure %K Proteins %K Proteins: chemistry %K Secondary %K Sequence Alignment %K Software Design %X A new method for the homology-based modeling of protein three-dimensional structures is proposed and evaluated. The alignment of a query sequence to a structural template produced by threading algorithms usually produces low-resolution molecular models. The proposed method attempts to improve these models. In the first stage, a high-coordination lattice approximation of the query protein fold is built by suitable tracking of the incomplete alignment of the structural template and connection of the alignment gaps. These initial lattice folds are very similar to the structures resulting from standard molecular modeling protocols. Then, a Monte Carlo simulated annealing procedure is used to refine the initial structure. The process is controlled by the model's internal force field and a set of loosely defined restraints that keep the lattice chain in the vicinity of the template conformation. The internal force field consists of several knowledge-based statistical potentials that are enhanced by a proper analysis of multiple sequence alignments. The template restraints are implemented such that the model chain can slide along the template structure or even ignore a substantial fraction of the initial alignment. The resulting lattice models are, in most cases, closer (sometimes much closer) to the target structure than the initial threading-based models. All atom models could easily be built from the lattice chains. 
The method is illustrated on 12 examples of target/template pairs whose initial threading alignments are of varying quality. Possible applications of the proposed method for use in protein function annotation are briefly discussed. %B Proteins %V 37 %P 592–610 %8 dec %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/10651275 %0 Journal Article %J Proceedings of the National Academy of Sciences of the United States of America %D 1998 %T Nativelike topology assembly of small proteins using predicted restraints in Monte Carlo folding simulations %A Angel R. Ortiz %A Andrzej Koliński %A Jeffrey Skolnick %K Algorithms %K Models %K Molecular %K Monte Carlo Method %K Protein Folding %K Protein Structure %K Secondary %K Sequence Alignment %K Software %K Tertiary %X By incorporating predicted secondary and tertiary restraints derived from multiple sequence alignments into ab initio folding simulations, it has been possible to assemble native-like tertiary structures for a test set of 19 nonhomologous proteins ranging from 29 to 100 residues in length and representing all secondary structural classes. Secondary structural restraints are provided by the PHD secondary structure prediction algorithm that incorporates multiple sequence information. Multiple sequence alignments also provide predicted tertiary restraints via a two-step process: First, seed side chain contacts are selected from a correlated mutation analysis, and then an inverse folding algorithm expands these seed contacts. The predicted secondary and tertiary restraints are incorporated into a lattice-based, reduced protein model for structure assembly and refinement. The resulting native-like topologies exhibit a coordinate root-mean-square deviation from native for the whole chain between 3.1 and 6.7 Å, with values ranging from 2.6 to 4.1 Å over approximately 80% of the structure. 
Overall, this study suggests that the use of restraints derived from multiple sequence alignments combined with a fold assembly algorithm is a promising approach to the prediction of the global topology of small proteins. %B Proceedings of the National Academy of Sciences of the United States of America %V 95 %P 1020–1025 %8 feb %G eng %U http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=18658&tool=pmcentrez&rendertype=abstract %0 Journal Article %J Proteins %D 1998 %T Tertiary structure prediction of the KIX domain of CBP using Monte Carlo simulations driven by restraints derived from multiple sequence alignments %A Angel R. Ortiz %A Andrzej Koliński %A Jeffrey Skolnick %K Algorithms %K Amino Acid Sequence %K CREB-Binding Protein %K Databases as Topic %K Models %K Molecular %K Molecular Sequence Data %K Monte Carlo Method %K Mutation %K Mutation: genetics %K Nuclear Proteins %K Nuclear Proteins: chemistry %K Protein Folding %K Protein Structure %K Secondary %K Sequence Alignment %K Tertiary %K Trans-Activators %K Transcription Factors %K Transcription Factors: chemistry %X Using a recently developed protein folding algorithm, a prediction of the tertiary structure of the KIX domain of the CREB binding protein is described. The method incorporates predicted secondary and tertiary restraints derived from multiple sequence alignments in a reduced protein model whose conformational space is explored by Monte Carlo dynamics. Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that was modified for the presence of predicted U-turns, i.e., regions where the chain reverses global direction. Tertiary restraints are obtained via a two-step process: First, seed side-chain contacts are identified from a correlated mutation analysis, and then, a threading-based algorithm expands the number of these seed contacts. 
Blind predictions indicate that the KIX domain is a putative three-helix bundle, although the chirality of the bundle could not be uniquely determined. The expected root-mean-square deviation for the correct chirality of the KIX domain is between 5.0 and 6.2 Å. This is to be compared with the estimate of 12.9 Å that would be expected by a random prediction, using the model of F. Cohen and M. Sternberg (J. Mol. Biol. 138:321-333, 1980). %B Proteins %V 30 %P 287–294 %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/9517544 %0 Journal Article %J Proteins %D 1997 %T Improved method for prediction of protein backbone U-turn positions and major secondary structural elements between U-turns %A Wei-Ping Hu %A Andrzej Koliński %A Jeffrey Skolnick %K Amino Acid %K Amino Acid Sequence %K Amino Acids %K Amino Acids: chemistry %K Data Interpretation %K Models %K Molecular %K Molecular Sequence Data %K Protein Structure %K Proteins %K Proteins: chemistry %K Reproducibility of Results %K Secondary %K Sequence Alignment %K Sequence Alignment: methods %K Sequence Alignment: statistics & numerical data %K Sequence Homology %K Statistical %X A new and more accurate method has been developed for predicting the backbone U-turn positions (where the chain reverses global direction) and the dominant secondary structure elements between U-turns in globular proteins. The current approach uses sequence-specific secondary structure propensities and multiple sequence information. The latter plays an important role in the enhanced success of this approach. Application to two sets (total 108) of small to medium-sized, single-domain proteins indicates that approximately 94% of the U-turn locations are correctly predicted within three residues, as are 88% of dominant secondary structure elements. These results are significantly better than our previous method (Kolinski et al., Proteins 27:290-308, 1997). 
The current study strongly suggests that the U-turn locations are primarily determined by local interactions. Furthermore, both global length constraints and local interactions contribute significantly to the determination of the secondary structure types between U-turns. Accurate U-turn predictions are crucial for accurate secondary structure predictions in the current method. Protein structure modeling, tertiary structure predictions, and possibly, fold recognition should benefit from the predicted structural data provided by this new method. %B Proteins %V 29 %P 443–460 %G eng %U http://www.ncbi.nlm.nih.gov/pubmed/9408942