Source: Journal of Chemical Theory and Computation, 13(11):5766-5779, 2017
Coarse-grained modeling of biomolecules plays an important role in molecular biology. In this work we present SURPASS (Single United Residue per Pre-Averaged Secondary Structure fragment), a novel model of proteins that can be an interesting alternative to existing coarse-grained models. The design of the model is unique and strongly supported by statistical analysis of the structural regularities characteristic of protein systems. Coarse-graining of protein chain structures assumes a single center of interaction per residue and accounts for the preaveraged effects of fragments of four adjacent residues. Knowledge-based statistical potentials encode complex patterns of these fragments. Using the Replica Exchange Monte Carlo sampling scheme and a generic version of the SURPASS force field, we performed test simulations of a representative set of single-domain globular proteins. The method samples a significant part of conformational space and reproduces protein structures, including native-like ones, with surprisingly good accuracy. Future extension of the SURPASS model to large biomacromolecular systems is briefly discussed.
Keywords: coarse-grained models, de novo protein folding, empirical force field, knowledge-based potential, protein modeling, reduced models
SURPASS (Single United Residue per Pre-Averaged Secondary Structure fragment) is a new low-resolution coarse-grained model for protein simulations that can be an interesting alternative to existing coarse-grained models. The deep simplification of the SURPASS representation results in a substantial computational speed-up. More details on the SURPASS model can be found on the SURPASS wiki.
The schematic illustration shows the projection from the all-atom to the coarse-grained SURPASS representation for two protein fragments: helical (left panel) and β-strand (right panel).
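In code, a projection of this kind might look like the following minimal sketch. It assumes that each pseudo atom is placed at the average position of the Cα atoms of four consecutive residues; this placement rule, the function name project_to_pseudo_atoms, and the window parameter are illustrative assumptions and are not taken from the SURPASS distribution.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def project_to_pseudo_atoms(ca_coords: List[Vec3], window: int = 4) -> List[Vec3]:
    """Average every run of `window` consecutive C-alpha positions.

    Assumption (not confirmed by the text above): one pseudo atom is placed
    at the geometric centre of each sliding four-residue fragment.
    """
    if window < 1:
        raise ValueError("window must be positive")
    pseudo_atoms: List[Vec3] = []
    # A chain of N residues yields N - window + 1 overlapping fragments.
    for start in range(len(ca_coords) - window + 1):
        fragment = ca_coords[start:start + window]
        # zip(*fragment) groups the x, y and z components across the fragment.
        x, y, z = (sum(axis) / window for axis in zip(*fragment))
        pseudo_atoms.append((x, y, z))
    return pseudo_atoms


# Hypothetical input: five C-alpha atoms spaced 1 unit apart along the x axis.
ca_trace = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
            (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
pseudo = project_to_pseudo_atoms(ca_trace)
```

Under this assumption a chain of N residues maps to N - 3 pseudo atoms, so the five-residue toy trace above gives two pseudo atoms.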
The SURPASS force field consists of a set of general, sequence-independent short- and long-range potentials. In the basic variant of the force field, sequence dependence enters only through the preferred secondary structure assigned to each pseudo atom in the three-letter code (H, E, C). The knowledge-based statistics of the SURPASS force field can be downloaded from here.
The SURPASS representation and force field can be used in various simulation protocols, including molecular dynamics, Monte Carlo, and other modeling techniques. In this work, the simulation process was controlled by the replica exchange Monte Carlo (REMC) scheme. The model has been tested on a representative set of single-domain globular proteins containing from 56 to 334 amino acid residues. SURPASS allows very efficient sampling of the entire conformational space of polypeptide chains, and the accuracy of the resulting native-like models (measured by the RMSDSURPASS) is surprisingly good for such a level of coarse-graining.
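The replica exchange step mentioned above can be illustrated with the standard Metropolis swap criterion, in which two replicas at inverse temperatures β_i and β_j with energies E_i and E_j are exchanged with probability min(1, exp[(β_i − β_j)(E_i − E_j)]). The sketch below shows this generic criterion only; the function names, the alternating even/odd pairing, and the sample numbers are assumptions for illustration and do not describe the actual SURPASS implementation.

```python
import math
import random
from typing import List, Sequence


def swap_probability(beta_i: float, beta_j: float,
                     energy_i: float, energy_j: float) -> float:
    """Metropolis acceptance probability for exchanging two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative delta means the swap is always accepted.
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_neighbour_swaps(betas: Sequence[float], energies: Sequence[float],
                            rng: random.Random, start: int = 0) -> List[int]:
    """Try to swap neighbouring replicas (k, k+1) for k = start, start+2, ...

    Returns `order`, where order[k] is the index of the configuration that
    ends up at temperature slot k. Alternating start=0 and start=1 between
    calls is a common pairing choice, assumed here for illustration.
    """
    order = list(range(len(betas)))
    current = list(energies)
    for k in range(start, len(betas) - 1, 2):
        p = swap_probability(betas[k], betas[k + 1], current[k], current[k + 1])
        if rng.random() < p:
            order[k], order[k + 1] = order[k + 1], order[k]
            current[k], current[k + 1] = current[k + 1], current[k]
    return order


# Hypothetical two-replica example: the colder replica (larger beta) holds
# the higher energy, so the exchange is accepted with probability 1.
order = attempt_neighbour_swaps([1.0, 0.5], [-10.0, -20.0], random.Random(0))
```

Swaps that move a lower-energy conformation to a colder replica are always accepted, while the reverse exchange is accepted only with an exponentially small probability; this is what lets conformations found at high temperature migrate toward the low-temperature replicas.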