%0 Journal Article
%J Biophysical Journal
%D 2003
%T TOUCHSTONE II: a new approach to ab initio protein structure prediction
%A Yang Zhang
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Algorithms
%K Amino Acid Sequence
%K Computer Simulation
%K Crystallography
%K Crystallography: methods
%K Energy Transfer
%K Models
%K Molecular
%K Molecular Sequence Data
%K Protein
%K Protein Conformation
%K Protein Folding
%K Protein Structure
%K Protein: methods
%K Proteins
%K Proteins: chemistry
%K Secondary
%K Sequence Analysis
%K Software
%K Static Electricity
%K Statistical
%X We have developed a new combined approach for ab initio protein structure prediction. The protein conformation is described as a lattice chain connecting Cα atoms, with attached Cβ atoms and side-chain centers of mass. The model force field includes various short-range and long-range knowledge-based potentials derived from a statistical analysis of the regularities of protein structures. The combination of these energy terms is optimized through the maximization of correlation for 30 × 60,000 decoys between the root mean square deviation (RMSD) to native and energies, as well as the energy gap between native and the decoy ensemble. To accelerate the conformational search, a newly developed parallel hyperbolic sampling algorithm with a composite movement set is used in the Monte Carlo simulation processes. We exploit this strategy to successfully fold 41/100 small proteins (36–120 residues) with predicted structures having an RMSD from native below 6.5 Å in the top five cluster centroids. To fold larger-size proteins as well as to improve the folding yield of small proteins, we incorporate into the basic force field side-chain contact predictions from our threading program PROSPECTOR where homologous proteins were excluded from the database.
With these threading-based restraints, the program can fold 83/125 test proteins (36–174 residues) with structures having an RMSD to native below 6.5 Å in the top five cluster centroids. This shows the significant improvement of folding by using predicted tertiary restraints, especially when the accuracy of side-chain contact prediction is >20%. For native fold selection, we introduce quantities dependent on the cluster density and the combination of energy and free energy, which show a higher discriminative power to select the native structure than the previously used cluster energy or cluster size, and which can be used in native structure identification in blind simulations. These procedures are readily automated and are being implemented on a genomic scale.
%B Biophysical Journal
%V 85
%P 1145–64
%G eng
%U http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1303233&tool=pmcentrez&rendertype=abstract
%R 10.1016/S0006-3495(03)74551-2

%0 Journal Article
%J Proteins
%D 2000
%T Computer simulations of the properties of the alpha2, alpha2C, and alpha2D de novo designed helical proteins
%A Andrzej Sikorski
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Amino Acid Sequence
%K Computer Simulation
%K Drug Design
%K Molecular Sequence Data
%K Protein Folding
%K Protein Structure
%K Proteins
%K Proteins: chemistry
%K Secondary
%K Thermodynamics
%X Reduced lattice models of the three de novo designed helical proteins alpha2, alpha2C, and alpha2D were studied. Low temperature stable folds were obtained for all three proteins. In all cases, the lowest energy folds were four-helix bundles. The folding pathway is qualitatively the same for all proteins studied. The energies of various topologies are similar, especially for the alpha2 polypeptide. The simulated crossover from molten globule to native-like behavior is very similar to that seen in experimental studies.
Simulations on a reduced protein model reproduce most of the experimental properties of the alpha2, alpha2C, and alpha2D proteins. Stable four-helix bundle structures were obtained, with increasing native-like behavior on going from alpha2 to alpha2D that mimics experiment.
%B Proteins
%V 38
%P 17–28
%8 jan
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/10651035

%0 Journal Article
%J Proteins
%D 1999
%T Ab initio folding of proteins using restraints derived from evolutionary information
%A Angel R. Ortiz
%A Andrzej Koliński
%A Piotr Rotkiewicz
%A Bartosz Ilkowski
%A Jeffrey Skolnick
%K Algorithms
%K Amino Acid Sequence
%K Evolution
%K Models
%K Molecular
%K Molecular Sequence Data
%K Monte Carlo Method
%K Protein Folding
%K Proteins
%K Proteins: chemistry
%X We present our predictions in the ab initio structure prediction category of CASP3. Eleven targets were folded, using a method based on a Monte Carlo search driven by secondary and tertiary restraints derived from multiple sequence alignments. Our results can be qualitatively summarized as follows: The global fold can be considered "correct" for targets 65 and 74, "almost correct" for targets 64, 75, and 77, "half-correct" for target 79, and "wrong" for targets 52, 56, 59, and 63. Target 72 has not yet been solved experimentally. On average, for small helical and alpha/beta proteins (on the order of 110 residues or smaller), the method predicted low resolution structures with a reasonably good prediction of the global topology. Most encouraging is that in some situations, such as with target 75 and, particularly, target 77, the method can predict a substantial portion of a rare or even a novel fold. However, the current method still fails on some beta proteins, proteins over the 110-residue threshold, and sequences in which only a poor multiple sequence alignment can be built.
On the other hand, for small proteins, the method gives results of quality at least similar to that of threading, with the advantage of not being restricted to known folds in the protein database. Overall, these results indicate that some progress has been made on the ab initio protein folding problem. Detailed information about our results can be obtained by connecting to http://www.bioinformatics.danforthcenter.org/CASP3.
%B Proteins
%V Suppl. 3
%P 177–185
%8 jan
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/10526366

%0 Journal Article
%J Biophysical Journal
%D 1999
%T Dynamics and thermodynamics of beta-hairpin assembly: insights from various simulation techniques
%A Andrzej Koliński
%A Bartosz Ilkowski
%A Jeffrey Skolnick
%K Amino Acid Sequence
%K Animals
%K Biophysical Phenomena
%K Biophysics
%K Models
%K Molecular
%K Molecular Sequence Data
%K Monte Carlo Method
%K Nerve Tissue Proteins
%K Nerve Tissue Proteins: chemistry
%K Protein Conformation
%K Protein Folding
%K Protein Structure
%K Proteins
%K Proteins: chemistry
%K Secondary
%K Thermodynamics
%X Small peptides that might have some features of globular proteins can provide important insights into the protein folding problem. Two simulation methods, Monte Carlo Dynamics (MCD), based on the Metropolis sampling scheme, and Entropy Sampling Monte Carlo (ESMC), were applied in a study of a high-resolution lattice model of the C-terminal fragment of the B1 domain of protein G. The results provide a detailed description of folding dynamics and thermodynamics and agree with recent experimental findings (Nature 390:196-197). In particular, it was found that the folding is cooperative and has features of an all-or-none transition. Hairpin assembly is usually initiated by turn formation; however, hydrophobic collapse, followed by the system rearrangement, was also observed.
The denatured state exhibits a substantial amount of fluctuating helical conformations, despite the strong beta-type secondary structure propensities encoded in the sequence.
%B Biophysical Journal
%V 77
%P 2942–52
%8 dec
%G eng
%U http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1300567&tool=pmcentrez&rendertype=abstract
%R 10.1016/S0006-3495(99)77127-4

%0 Journal Article
%J Proteins
%D 1999
%T A method for the improvement of threading-based protein models
%A Andrzej Koliński
%A Piotr Rotkiewicz
%A Bartosz Ilkowski
%A Jeffrey Skolnick
%K Amino Acid Sequence
%K Computer Simulation
%K Evaluation Studies as Topic
%K Methods
%K Models
%K Molecular
%K Molecular Sequence Data
%K Protein Conformation
%K Protein Structure
%K Proteins
%K Proteins: chemistry
%K Secondary
%K Sequence Alignment
%K Software Design
%X A new method for the homology-based modeling of protein three-dimensional structures is proposed and evaluated. The alignment of a query sequence to a structural template produced by threading algorithms usually produces low-resolution molecular models. The proposed method attempts to improve these models. In the first stage, a high-coordination lattice approximation of the query protein fold is built by suitable tracking of the incomplete alignment of the structural template and connection of the alignment gaps. These initial lattice folds are very similar to the structures resulting from standard molecular modeling protocols. Then, a Monte Carlo simulated annealing procedure is used to refine the initial structure. The process is controlled by the model's internal force field and a set of loosely defined restraints that keep the lattice chain in the vicinity of the template conformation. The internal force field consists of several knowledge-based statistical potentials that are enhanced by a proper analysis of multiple sequence alignments.
The template restraints are implemented such that the model chain can slide along the template structure or even ignore a substantial fraction of the initial alignment. The resulting lattice models are, in most cases, closer (sometimes much closer) to the target structure than the initial threading-based models. All-atom models could easily be built from the lattice chains. The method is illustrated on 12 examples of target/template pairs whose initial threading alignments are of varying quality. Possible applications of the proposed method for use in protein function annotation are briefly discussed.
%B Proteins
%V 37
%P 592–610
%8 dec
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/10651275

%0 Journal Article
%J Biophysical Journal
%D 1998
%T Computer simulations of de novo designed helical proteins
%A Andrzej Sikorski
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Amino Acid Sequence
%K Biophysical Phenomena
%K Biophysics
%K Computer Simulation
%K Dimerization
%K Drug Design
%K Hydrogen Bonding
%K Models
%K Molecular
%K Molecular Sequence Data
%K Monte Carlo Method
%K Protein Conformation
%K Protein Folding
%K Protein Structure
%K Proteins
%K Proteins: chemistry
%K Secondary
%K Thermodynamics
%X In the context of reduced protein models, Monte Carlo simulations of three de novo designed helical proteins (four-member helical bundle) were performed. At low temperatures, for all proteins under consideration, protein-like folds having different topologies were obtained from random starting conformations. These simulations are consistent with experimental evidence indicating that these de novo designed proteins have the features of a molten globule state. The results of Monte Carlo simulations suggest that these molecules adopt four-helix bundle topologies. They also give insight into the possible mechanism of folding and association, which occurs in these simulations by on-site assembly of the helices.
The low-temperature conformations of all three sequences have the features of a molten globule state.
%B Biophysical Journal
%V 75
%P 92–105
%8 jul
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/10651035
%R 10.1016/S0006-3495(98)77497-1

%0 Journal Article
%J Journal of Molecular Biology
%D 1998
%T Fold assembly of small proteins using Monte Carlo simulations driven by restraints derived from multiple sequence alignments
%A Angel R. Ortiz
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Amino Acid Sequence
%K Chemical
%K Models
%K Molecular Sequence Data
%K Monte Carlo Method
%K Protein Folding
%K Protein Structure
%K Secondary
%K Tertiary
%X The feasibility of predicting the global fold of small proteins by incorporating predicted secondary and tertiary restraints into ab initio folding simulations has been demonstrated on a test set comprised of 20 non-homologous proteins, of which one was a blind prediction of target 42 in the recent CASP2 contest. These proteins contain from 37 to 100 residues and represent all secondary structural classes and a representative variety of global topologies. Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that incorporates multiple sequence information. Predicted tertiary restraints are derived from multiple sequence alignments via a two-step process. First, seed side-chain contacts are identified from correlated mutation analysis, and then a threading-based algorithm is used to expand the number of these seed contacts. A lattice-based reduced protein model and a folding algorithm designed to incorporate these predicted restraints are described. Depending upon fold complexity, it is possible to assemble native-like topologies whose coordinate root-mean-square deviation from native is between 3.0 Å and 6.5 Å.
The requisite level of accuracy in side-chain contact map prediction can be roughly 25% on average, provided that about 60% of the contact predictions are correct within ±1 residue and 95% of the predictions are correct within ±4 residues. Precision in tertiary contact prediction is more critical than absolute accuracy. Furthermore, only a subset of the tertiary contacts, on the order of 25% of the total, is sufficient for successful topology assembly. Overall, this study suggests that the use of restraints derived from multiple sequence alignments combined with a fold assembly algorithm holds considerable promise for the prediction of the global topology of small proteins.
%B Journal of Molecular Biology
%V 277
%P 419–448
%8 mar
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/9514747
%R 10.1006/jmbi.1997.1595

%0 Journal Article
%J Proteins
%D 1998
%T Tertiary structure prediction of the KIX domain of CBP using Monte Carlo simulations driven by restraints derived from multiple sequence alignments
%A Angel R. Ortiz
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Algorithms
%K Amino Acid Sequence
%K CREB-Binding Protein
%K Databases as Topic
%K Models
%K Molecular
%K Molecular Sequence Data
%K Monte Carlo Method
%K Mutation
%K Mutation: genetics
%K Nuclear Proteins
%K Nuclear Proteins: chemistry
%K Protein Folding
%K Protein Structure
%K Secondary
%K Sequence Alignment
%K Tertiary
%K Trans-Activators
%K Transcription Factors
%K Transcription Factors: chemistry
%X Using a recently developed protein folding algorithm, a prediction of the tertiary structure of the KIX domain of the CREB binding protein is described. The method incorporates predicted secondary and tertiary restraints derived from multiple sequence alignments in a reduced protein model whose conformational space is explored by Monte Carlo dynamics.
Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that was modified for the presence of predicted U-turns, i.e., regions where the chain reverses global direction. Tertiary restraints are obtained via a two-step process: First, seed side-chain contacts are identified from a correlated mutation analysis, and then, a threading-based algorithm expands the number of these seed contacts. Blind predictions indicate that the KIX domain is a putative three-helix bundle, although the chirality of the bundle could not be uniquely determined. The expected root-mean-square deviation for the correct chirality of the KIX domain is between 5.0 and 6.2 Å. This is to be compared with the estimate of 12.9 Å that would be expected by a random prediction, using the model of F. Cohen and M. Sternberg (J. Mol. Biol. 138:321-333, 1980).
%B Proteins
%V 30
%P 287–294
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/9517544

%0 Journal Article
%J Proteins
%D 1997
%T Improved method for prediction of protein backbone U-turn positions and major secondary structural elements between U-turns
%A Wei-Ping Hu
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Amino Acid
%K Amino Acid Sequence
%K Amino Acids
%K Amino Acids: chemistry
%K Data Interpretation
%K Models
%K Molecular
%K Molecular Sequence Data
%K Protein Structure
%K Proteins
%K Proteins: chemistry
%K Reproducibility of Results
%K Secondary
%K Sequence Alignment
%K Sequence Alignment: methods
%K Sequence Alignment: statistics & numerical data
%K Sequence Homology
%K Statistical
%X A new and more accurate method has been developed for predicting the backbone U-turn positions (where the chain reverses global direction) and the dominant secondary structure elements between U-turns in globular proteins. The current approach uses sequence-specific secondary structure propensities and multiple sequence information. The latter plays an important role in the enhanced success of this approach.
Application to two sets (total 108) of small to medium-sized, single-domain proteins indicates that approximately 94% of the U-turn locations are correctly predicted within three residues, as are 88% of dominant secondary structure elements. These results are significantly better than those of our previous method (Kolinski et al., Proteins 27:290-308, 1997). The current study strongly suggests that the U-turn locations are primarily determined by local interactions. Furthermore, both global length constraints and local interactions contribute significantly to the determination of the secondary structure types between U-turns. Accurate U-turn predictions are crucial for accurate secondary structure predictions in the current method. Protein structure modeling, tertiary structure predictions, and possibly, fold recognition should benefit from the predicted structural data provided by this new method.
%B Proteins
%V 29
%P 443–460
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/9408942

%0 Journal Article
%J Proteins
%D 1997
%T A method for the prediction of surface "U"-turns and transglobular connections in small proteins
%A Andrzej Koliński
%A Jeffrey Skolnick
%A Adam Godzik
%A Wei-Ping Hu
%K Algorithms
%K Amino Acid Sequence
%K Animals
%K Humans
%K Molecular Sequence Data
%K Protein Folding
%K Protein Structure
%K Proteins
%K Proteins: chemistry
%K Secondary
%X A simple method has been developed for predicting the location of surface loops/turns that change the overall direction of the chain, that is, "U" turns, and for assigning the dominant secondary structure of the intervening transglobular blocks in small, single-domain globular proteins. Since the emphasis of the method is on the prediction of the major topological elements that comprise the global structure of the protein rather than on a detailed local secondary structure description, this approach is complementary to standard secondary structure prediction schemes.
Consequently, it may be useful in the early stages of tertiary structure prediction when establishment of the structural class and possible folding topologies is of interest. Application to a set of small proteins of known structure indicates a high level of accuracy. The prediction of the approximate location of the surface turns/loops that are responsible for the change in overall chain direction is correct in more than 95% of the cases. The accuracy for the dominant secondary structure assignment for the linear blocks between such surface turns/loops is in the range of 82%.
%B Proteins
%V 27
%P 290–308
%8 feb
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/9061792

%0 Journal Article
%J Protein Engineering
%D 1996
%T Does a backwardly read protein sequence have a unique native state?
%A Krzysztof A. Olszewski
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Amino Acid Sequence
%K Computer Simulation
%K Models
%K Molecular
%K Molecular Sequence Data
%K Monte Carlo Method
%K Protein Conformation
%K Protein Engineering
%K Protein Folding
%K Protein Structure
%K Secondary
%K Staphylococcal Protein A
%K Staphylococcal Protein A: chemistry
%K Tertiary
%X Amino acid sequences of native proteins are generally not palindromic. Nevertheless, the protein molecule obtained as a result of reading the sequence backwards, i.e. a retro-protein, obviously has the same amino acid composition and the same hydrophobicity profile as the native sequence. The important questions which arise in the context of retro-proteins are: does a retro-protein fold to a well defined native-like structure as natural proteins do and, if the answer is positive, does a retro-protein fold to a structure similar to the native conformation of the original protein? In this work, the fold of retro-protein A, originated from the retro-sequence of the B domain of Staphylococcal protein A, was studied.
As a result of lattice model simulations, it is conjectured that the retro-protein A also forms a three-helix bundle structure in solution. It is also predicted that the topology of the retro-protein A three-helix bundle is that of the native protein A, rather than that corresponding to the mirror image of native protein A. Secondary structure elements in the retro-protein do not exactly match their counterparts in the original protein structure; however, the amino acid side chain contact pattern of the hydrophobic core is partly conserved.
%B Protein Engineering
%V 9
%P 5–14
%8 jan
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/9053902

%0 Journal Article
%J Biochemistry
%D 1996
%T Method for predicting the state of association of discretized protein models. Application to leucine zippers.
%A Michal Vieth
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Amino Acid Sequence
%K Leucine Zippers
%K Molecular Sequence Data
%K Protein Folding
%X A method that employs a transfer matrix treatment combined with Monte Carlo sampling has been used to calculate the configurational free energies of folded and unfolded states of lattice models of proteins. The method is successfully applied to study the monomer-dimer equilibria in various coiled coils. For the short coiled coils, GCN4 leucine zipper, and its fragments, Fos and Jun, very good agreement is found with experiment. Experimentally, some subdomains of the GCN4 leucine zipper form stable dimeric structures, suggesting the regions of differential stability in the parent structure. Our calculations suggest that the stabilities of the subdomains are in general different from the values expected simply from the stability of the corresponding fragment in the wild type molecule. Furthermore, parts of the fragments structurally rearrange in some regions with respect to their corresponding wild type positions.
Our results suggest that, for an Asn in the dimerization interface, at least a pair of hydrophobic interacting helical turns at each side is required to stabilize the coiled coil. Finally, the specificity of heterodimer formation in the Fos-Jun system comes from the relative instability of Fos homodimers, resulting from unfavorable intra- and interhelical interactions in the interfacial coiled coil region.
%B Biochemistry
%V 35
%P 955–967
%G eng
%U http://www.ncbi.nlm.nih.gov/pubmed/8547278
%R 10.1021/bi9520702

%0 Journal Article
%J Proteins
%D 1994
%T Monte Carlo simulations of protein folding. I. Lattice model and interaction scheme
%A Andrzej Koliński
%A Jeffrey Skolnick
%K Amino Acid Sequence
%K Amino Acids
%K Computer Simulation
%K Hydrogen Bonding
%K Models, Chemical
%K Models, Molecular
%K Models, Theoretical
%K Molecular Sequence Data
%K Monte Carlo Method
%K Protein Folding
%K Protein Structure, Tertiary
%X A new hierarchical method for the simulation of the protein folding process and the de novo prediction of protein three-dimensional structure is proposed. The reduced representation of the protein alpha-carbon backbone employs lattice discretizations of increasing geometrical resolution and a single ball representation of side chain rotamers. In particular, coarser and finer lattice backbone descriptions are used. The coarser (finer) lattice represents Cα traces of native proteins with an accuracy of 1.0 (0.7) Å rms. Folding is simulated by means of very fast Monte Carlo lattice dynamics. The potential of mean force, predominantly of statistical origin, contains several novel terms that facilitate the cooperative assembly of secondary structure elements and the cooperative packing of the side chains. Particular contributions to the interaction scheme are discussed in detail. In the accompanying paper (Kolinski, A., Skolnick, J. Monte Carlo simulation of protein folding. II. Application to protein A, ROP, and crambin.
Proteins 18:353-366, 1994), the method is applied to three small globular proteins.
%B Proteins
%V 18
%P 338–52
%8 apr
%G eng
%N 4
%R 10.1002/prot.340180405