Informatics

Protein homodimers structure prediction based on deep neural network

https://doi.org/10.37661/1816-0301-2020-17-2-44-53

Abstract

Structure prediction for protein-protein complexes has important applications in domains such as the modeling of biological processes and drug design. Homodimers (complexes consisting of two identical proteins) are the most common type of protein complex in nature, but there is still no universal algorithm for predicting their 3D structures. Experimental techniques for determining the structure of a protein complex require enormous amounts of time and resources, and each method has its own limitations. Recently, deep neural networks have made it possible to predict the structures of individual proteins with much higher accuracy than other algorithmic approaches. Building on this idea, we developed a deep learning-based algorithm to model the 3D structure of a homodimer. It consists of two major steps: in the first, a contact map of the protein complex is predicted with a deep convolutional neural network; in the second, the 3D structure of the homodimer is predicted from the obtained contact map by means of an optimization procedure. Combining the neural network with an optimization procedure based on gradient descent made it possible to predict structures of protein homodimers. The suggested approach was tested and validated on a dataset of protein homodimers from the Protein Data Bank (PDB). The developed procedure could also be used to evaluate protein homodimer models as one stage of drug development.
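
The second step described in the abstract (recovering a 3D arrangement from a contact map by gradient descent) can be illustrated with a toy sketch. This is not the authors' implementation: the contact map here is computed from a known placement rather than predicted by a neural network, the 8 Å contact cutoff is an assumption, the gradients are finite-difference estimates with a simple backtracking step, and all function names and coordinates are hypothetical.

```python
import numpy as np

CUTOFF = 8.0  # assumed contact threshold in angstroms; the abstract does not state one


def contact_map(coords_a, coords_b, cutoff=CUTOFF):
    """Binary inter-chain contact map: 1.0 where two residues are closer than cutoff."""
    dists = np.linalg.norm(coords_a[:, None, :] - coords_b[None, :, :], axis=-1)
    return (dists < cutoff).astype(float)


def rotation_matrix(rotvec):
    """Rodrigues' formula: rotation vector (axis times angle) to a 3x3 matrix."""
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        return np.eye(3)
    k = rotvec / angle
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def place_copy(coords, params):
    """Rigidly move a chain: params = 3 rotation-vector + 3 translation components."""
    return coords @ rotation_matrix(params[:3]).T + params[3:]


def contact_loss(params, coords, contacts, cutoff=CUTOFF):
    """Squared excess distance, summed over residue pairs marked as contacts."""
    moved = place_copy(coords, params)
    dists = np.linalg.norm(coords[:, None, :] - moved[None, :, :], axis=-1)
    excess = np.maximum(dists - cutoff, 0.0)
    return float(np.sum(contacts * excess ** 2))


def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grad


def fit_dimer(coords, contacts, init, steps=200, lr=1.0):
    """Gradient descent on the six rigid-body parameters with step halving."""
    params = np.asarray(init, dtype=float).copy()
    loss = contact_loss(params, coords, contacts)
    for _ in range(steps):
        grad = numerical_gradient(lambda p: contact_loss(p, coords, contacts), params)
        step = lr
        while step > 1e-10:
            candidate = params - step * grad
            cand_loss = contact_loss(candidate, coords, contacts)
            if cand_loss < loss:  # accept only steps that lower the loss
                params, loss = candidate, cand_loss
                break
            step *= 0.5
        else:
            break  # no improving step found, so stop early
    return params, loss


# Toy monomer: ten pseudo-residues on a helix (made-up coordinates, in angstroms).
idx = np.arange(10)
coords = np.stack([2.3 * np.cos(np.deg2rad(100.0) * idx),
                   2.3 * np.sin(np.deg2rad(100.0) * idx),
                   1.5 * idx], axis=1)
true_params = np.array([0.0, 0.0, np.pi, 6.0, 0.0, 0.0])  # two-fold axis parallel to z
contacts = contact_map(coords, place_copy(coords, true_params))  # stands in for a predicted map
start = true_params + np.array([0.0, 0.0, 0.0, 10.0, 0.0, 0.0])  # copy pushed 10 A away
start_loss = contact_loss(start, coords, contacts)
fitted, final_loss = fit_dimer(coords, contacts, start)
print(f"loss before: {start_loss:.3f}, after: {final_loss:.3f}")
```

The loss only penalises contact pairs that are too far apart, so many placements can reach zero loss; a realistic procedure would presumably also need terms that prevent the two chains from overlapping.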

About the Authors

A. Y. Hadarovich
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus; Belarusian State University
Belarus

Anna Y. Hadarovich, Researcher

Minsk



A. A. Kalinouski
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus

Alexander A. Kalinouski, Researcher

Minsk



A. V. Tuzikov
The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
Belarus

Alexander V. Tuzikov, Corresponding Member, Dr. Sci. (Phys.-Math.), Professor, General Director

Minsk



For citations:


Hadarovich A.Y., Kalinouski A.A., Tuzikov A.V. Protein homodimers structure prediction based on deep neural network. Informatics. 2020;17(2):44-53. (In Russ.) https://doi.org/10.37661/1816-0301-2020-17-2-44-53


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1816-0301 (Print)
ISSN 2617-6963 (Online)