Andrzej Kolinski Research Group

SURPASS

Short description:
SURPASS, a low-resolution coarse-grained protein model

 

The design of the SURPASS (Single United Residue per Pre-Averaged Secondary Structure fragment) protein model is unique and strongly supported by statistical analysis of the structural regularities characteristic of protein systems. The coarse-graining of protein chain structures assumes a single center of interaction per residue and accounts for the pre-averaged effects of fragments of four adjacent residues. Knowledge-based statistical potentials encode the complex interaction patterns of these fragments.
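
The pre-averaging idea can be illustrated with a minimal sketch. It assumes that each pseudo atom is placed at the mean position of four consecutive alpha carbons, which is one reading of the description above and not a statement about the BioShell implementation. The function name surpass_pseudo_atoms and the toy coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def surpass_pseudo_atoms(ca_trace: List[Point], window: int = 4) -> List[Point]:
    """Average each run of `window` consecutive C-alpha positions into one bead.

    Assumption: a pseudo atom is the plain mean of four consecutive C-alpha
    atoms. A chain of N alpha carbons then yields N - window + 1 pseudo atoms.
    """
    if len(ca_trace) < window:
        return []
    beads: List[Point] = []
    for start in range(len(ca_trace) - window + 1):
        chunk = ca_trace[start:start + window]
        # Mean of the x, y and z coordinates over the sliding window.
        bead = tuple(sum(p[axis] for p in chunk) / window for axis in range(3))
        beads.append(bead)
    return beads


# Toy input: six alpha carbons on a straight line, 3.8 Angstrom apart.
ca = [(3.8 * i, 0.0, 0.0) for i in range(6)]
beads = surpass_pseudo_atoms(ca)
print(len(beads), beads[0])
```

Under this assumption the coarse-grained chain is three beads shorter than the Cα trace, and each bead carries no direct information about individual residue positions, which is why the reverse mapping needs additional knowledge.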

The SURPASS software is available free of charge for academic use as a standalone program from the Laboratory repository. Note that SURPASS is implemented in C++11 as part of the BioShell 3.0 package and must be compiled (we recommend the g++ 4.9 compiler). After the package compiles successfully, you will find the executable program surpass in the bin directory. The knowledge-based statistics of the SURPASS force field can be downloaded from here.

To make the model more usable, an algorithm was developed for reconstructing the Cα trace from SURPASS pseudo atoms. Rebuilding SURPASS representations at a higher resolution is not trivial because the model is strongly averaged. Using SUReLib, the SURPASS Rebuild Library of fragments, it is possible to switch between the SURPASS model and a chain composed of alpha carbons. The knowledge-based pairs of fragments can be downloaded from here, and a more detailed description of the SUReLib library can be found here. The reconstructed alpha carbon positions reproduce the local geometry of the polypeptide chain and the correct spatial orientation of secondary structure elements.

More details on the SURPASS model and its applications can be found in the following publications:

 and services:

The movie shows an example protein folding simulation of the 2gb1 protein system using the SURPASS model.

 

The SUReLib library of fragments is composed of 300 unique pairs, each consisting of a 5-residue fragment in the SURPASS representation and the corresponding 6-residue fragment of the alpha carbon chain. The repository is divided into 3 categories according to the secondary structure type of the fragment: 198 helical fragments (type 0), 88 beta fragments (type 1), and 14 mainly unstructured fragments or loops (type 0). The content of SUReLib was selected by clustering 24106 pairs of fragments and picking representatives of the densest clusters. Finally, the fragments were subjected to rototranslational superposition, and the SURPASS and Cα pairs were ordered to simplify their further use in chain reconstruction procedures.
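
How a library pair might be used in reconstruction can be sketched with a standard Kabsch superposition. This is an illustration under stated assumptions, not the reconstruction algorithm shipped with SURPASS: the five SURPASS beads of a library fragment are superimposed on five beads of the target chain, and the same rotation and translation are applied to the six paired alpha carbons. The function names and the random test coordinates are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def kabsch(mobile: np.ndarray, target: np.ndarray):
    """Return rotation R and translation t that best map `mobile` onto `target`.

    Both arrays have shape (n, 3). The fitted coordinates are mobile @ R.T + t.
    """
    mob_c = mobile.mean(axis=0)
    tgt_c = target.mean(axis=0)
    cov = (mobile - mob_c).T @ (target - tgt_c)  # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(cov)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rot, tgt_c - rot @ mob_c


def place_ca_fragment(lib_surpass, lib_ca, target_surpass):
    """Fit the library beads onto the target beads, then move the paired CA atoms."""
    rot, shift = kabsch(lib_surpass, target_surpass)
    fitted = lib_surpass @ rot.T + shift
    rmsd = float(np.sqrt(((fitted - target_surpass) ** 2).sum(axis=1).mean()))
    return lib_ca @ rot.T + shift, rmsd


# Synthetic check: the target is a rotated and shifted copy of the library beads.
rng = np.random.default_rng(0)
lib_surpass = rng.normal(size=(5, 3))  # stand-in for 5 SURPASS beads
lib_ca = rng.normal(size=(6, 3))       # stand-in for the 6 paired CA atoms
angle = 0.7
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
offset = np.array([1.0, -2.0, 0.5])
target_surpass = lib_surpass @ rot_z.T + offset
rebuilt_ca, rmsd = place_ca_fragment(lib_surpass, lib_ca, target_surpass)
```

In a real procedure the fit RMSD could serve as a criterion for choosing among library fragments of the matching secondary structure type, but that selection rule is also an assumption here.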

 

SUReLib file content:

type S1.x S1.y S1.z (S2 - S5) CA1.x CA1.y CA1.z (CA2 - CA6) counts
0 -3.05 0.10 0.56 ... -3.87 1.51 1.42 ... 180
...                  
1 -6.39 -0.27 0.55 ... -7.86 0.51 1.10 ... 30
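
A file in this layout could be read with a short parser such as the sketch below. It assumes whitespace-separated columns in the order given by the header line: the type code, 15 coordinates for the five SURPASS pseudo atoms, 18 coordinates for the six alpha carbons, and the counts value, 35 columns in total. The function names are hypothetical and the sample record is a synthetic placeholder, not data from the library.

```python
from typing import Dict, Iterable, List

N_SURPASS = 5  # pseudo atoms per fragment (S1 - S5)
N_CA = 6       # alpha carbons per fragment (CA1 - CA6)
N_COLUMNS = 1 + 3 * N_SURPASS + 3 * N_CA + 1  # type + coordinates + counts


def parse_surelib_line(line: str) -> Dict[str, object]:
    """Split one record into its type, SURPASS beads, CA atoms and counts."""
    fields = line.split()
    if len(fields) != N_COLUMNS:
        raise ValueError(f"expected {N_COLUMNS} columns, got {len(fields)}")
    values = [float(x) for x in fields[1:-1]]
    # Group the flat coordinate list into (x, y, z) triples.
    points = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    return {
        "type": int(fields[0]),
        "surpass": points[:N_SURPASS],
        "ca": points[N_SURPASS:],
        "counts": int(fields[-1]),
    }


def parse_surelib(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Parse all records, skipping blank lines and the header line."""
    records = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("type"):
            continue
        records.append(parse_surelib_line(stripped))
    return records


# Synthetic placeholder record: type 0, 33 made-up coordinates, counts 180.
header = "type S1.x S1.y S1.z (S2 - S5) CA1.x CA1.y CA1.z (CA2 - CA6) counts"
sample = " ".join(["0"] + [f"{0.5 * i:.2f}" for i in range(33)] + ["180"])
records = parse_surelib([header, sample, ""])
print(records[0]["type"], len(records[0]["surpass"]), len(records[0]["ca"]))
```

Raising an error on an unexpected column count is a deliberate choice: a silently misaligned record would shift every coordinate that follows it.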

 

The local geometry of the rebuilt structures reproduces the regularities observed in known experimental structures. The orientation of alpha carbons in β-type secondary structure fragments is also retained. This means that neighboring pseudo atoms within a single strand have opposite orientations with respect to the β-sheet surface, while pseudo atoms that lie in neighboring strands of one β-sheet and are connected by a coarse-grained hydrogen bond have the same orientation. Further reconstruction from the Cα chain to the complete main chain or to full atomic detail is a solved problem. In this context, the structural accuracy of the model is in the range of 2 to 3 Å, which is an acceptable resolution range for known all-atom structure optimisation protocols.