## Aggrescan3D method

Aggrescan3D (A3D) aims to predict the aggregation propensity of proteins in their folded states. To this end, A3D takes as input protein 3D structures in PDB format, derived from X-ray diffraction, solution NMR or modelling approaches. The structures are energy-minimized before analysis. The method exploits an experimentally derived intrinsic aggregation propensity scale for the natural amino acids [1-2] and projects this scale onto the protein 3D structure. In the A3D method, the intrinsic aggregation propensity of each amino acid in the structure is modulated by its specific structural context. Aggregation propensity is calculated for spherical regions centred on the Cα carbon of every residue. This provides a unique, structurally corrected aggregation value (A3D score) for each amino acid in the structure, which is formulated as:

$$A3D\ score = Agg_i \times \left(\alpha \times e^{\beta \times RSA_i}\right) + \sum \left[Agg_e \times \left(\alpha \times e^{\beta \times RSA_e}\right) \times \left(\gamma \times e^{-\delta \times dist}\right)\right]$$

where: $$Agg_i$$ is the intrinsic aggregation propensity of the residue in the centre of the sphere; $$RSA_i$$ is its relative surface area exposed to solvent; $$Agg_e$$ is the intrinsic aggregation propensity of each additional residue included in the sphere; $$RSA_e$$ is its relative surface area exposed to solvent; and $$dist$$ is its distance to the central residue $$i$$.
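The formula above can be illustrated with a minimal Python sketch. It is not the A3D implementation: the `Residue` class, the function names, the default values of `alpha`, `beta`, `gamma` and `delta`, and the sphere `radius` are all hypothetical placeholders chosen for illustration, since the constants used by the method are not given here.

```python
import math
from dataclasses import dataclass


@dataclass
class Residue:
    agg: float                       # intrinsic aggregation propensity (Agg)
    rsa: float                       # relative solvent-exposed surface area (RSA)
    ca: tuple[float, float, float]   # C-alpha coordinates


def exposure_weight(rsa: float, alpha: float, beta: float) -> float:
    """Structural-context term: alpha * exp(beta * RSA)."""
    return alpha * math.exp(beta * rsa)


def a3d_score(i, residues, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0, radius=10.0):
    """Score of residue i following the formula in the text.

    All constants and the sphere radius are placeholder assumptions,
    not the values used by A3D.
    """
    centre = residues[i]
    # Contribution of the central residue.
    score = centre.agg * exposure_weight(centre.rsa, alpha, beta)
    # Contributions of the other residues inside the sphere,
    # damped exponentially with their distance to the centre.
    for j, other in enumerate(residues):
        if j == i:
            continue
        dist = math.dist(centre.ca, other.ca)
        if dist > radius:
            continue
        score += (other.agg
                  * exposure_weight(other.rsa, alpha, beta)
                  * gamma * math.exp(-delta * dist))
    return score
```

In this sketch a buried residue (low RSA) receives a small exposure weight when `beta` is positive, which mirrors the way the method down-weights residues hidden in the protein core.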

A3D discards the negligible contribution to aggregation of highly hydrophobic residues hidden in the core of folded proteins and focuses the prediction on protein surfaces. This structure-based approach identifies aggregation-prone patches that are typically not contiguous in sequence, unlike the regions identified by linear sequence or composition-based algorithms, and it outperforms those algorithms.

The identified aggregation-prone residues or their surroundings can be virtually mutated to design variants with increased solubility. The selected mutation or mutations are modelled, and a new A3D prediction is then generated for the resulting structure. An experimental study [3] has shown that A3D predictions allow the design of mutations that improve protein solubility without compromising conformation or stability. In that study, the solubility of unrelated polypeptides was tuned by A3D-designed, non-destabilizing mutations at the protein surface.

The dynamic structural fluctuations that a protein experiences in solution influence its aggregation propensity by promoting partial exposure of residues that are usually buried. Mutations that destabilize the protein and increase its conformational fluctuations therefore usually have a large impact on its aggregation propensity. For this reason, A3D can also be run in Dynamic Mode. In this mode, A3D uses the CABS-flex approach [4-5] for fast simulations of the near-native dynamics of globular proteins. The aggregation properties of the resulting ensemble of protein models are analysed, and the most aggregation-prone conformer is selected as a proxy for the aggregation-promoting state of the protein of interest.
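The selection step in Dynamic Mode can be sketched as follows. This is only an illustration: the text does not say how "most aggregation-prone" is measured, so the sketch assumes that models are ranked by their mean per-residue A3D score. The function name and the input layout (a dictionary mapping a model name to its list of per-residue scores) are hypothetical.

```python
from statistics import mean


def most_aggregation_prone(ensemble):
    """Return the name of the model with the highest mean A3D score.

    `ensemble` maps a model name to its list of per-residue scores.
    Ranking by the mean score is an assumption made for this sketch.
    """
    if not ensemble:
        raise ValueError("ensemble must contain at least one model")
    return max(ensemble, key=lambda name: mean(ensemble[name]))


# Toy ensemble with made-up scores, for illustration only.
toy_ensemble = {
    "model_1": [-1.0, 0.5],
    "model_2": [0.2, 0.4],
    "model_3": [-2.0, 1.0],
}
selected = most_aggregation_prone(toy_ensemble)
```

Other ranking criteria (for example the sum of positive scores only) would fit the same structure by changing the `key` function.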

Aggrescan 2.0 introduces a new mode called "Enhance protein solubility", which automatically identifies the strongest aggregation-prone regions (APRs) in the structure and suggests a series of point mutations that would increase protein solubility without affecting its stability.

[Figure: The A3D server pipeline]

#### References

1. Conchillo-Sole, O., et al. AGGRESCAN: a server for the prediction and evaluation of "hot spots" of aggregation in polypeptides. BMC Bioinformatics 2007;8:65.
2. de Groot, N.S., et al. AGGRESCAN: method, application, and perspectives for drug design. Methods Mol Biol 2012;819:199-220.
3. Gil-Garcia, M., et al. Combining Structural Aggregation Propensity and Stability Predictions To Redesign Protein Solubility. Mol Pharm 2018;15(9):3846-3859.
4. Kuriata, A.*, Gierut, A.M.*, et al. CABS-flex 2.0: a web server for fast simulations of flexibility of protein structures. Nucleic Acids Research 2018;46(W1):W338-W343.
5. Kurcinski, M., et al. CABS-flex standalone: a simulation environment for fast modeling of protein flexibility. Bioinformatics, bty685.

### Aggrescan3D employs the following tools

• Aggrescan3D (aggregation propensity calculations based on 3D structures)
• FreeSASA (accessible surface calculations)
• FoldX (modeling of mutations in protein structures)
• Dynamic mode: CABS-flex (simulations of protein structure fluctuations and accompanying analysis)
• 3Dmol (interactive protein visualization)
• PyMOL (protein visualization)

###### Page created using Flask framework, Twitter Bootstrap styles, font-awesome, jQuery, d3.js, bokeh, hyphenator, and dataTables JS.

Laboratory of Theory of Biopolymers 2018