De novo protein fold design through sequence-independent fragment assembly simulations.

Keywords: de novo protein design; novel fold of protein; replica-exchange Monte Carlo simulation; structural design; structural motif

Journal

Proceedings of the National Academy of Sciences of the United States of America
ISSN: 1091-6490
Abbreviated title: Proc Natl Acad Sci U S A
Country: United States
NLM ID: 7505876

Publication information

Publication date:
24 Jan 2023
History:
entrez: 19 Jan 2023
pubmed: 20 Jan 2023
medline: 24 Jan 2023
Status: ppublish

Abstract

De novo protein design generally consists of two steps: structure design and sequence design. Many protein design studies have focused on sequence design with scaffolds adapted from native structures in the PDB, which leaves novel areas of protein structure and function space unexplored. We developed FoldDesign to create novel protein folds from specific secondary structure (SS) assignments through sequence-independent replica-exchange Monte Carlo (REMC) simulations. The method was tested on 354 non-redundant topologies, where FoldDesign consistently created stable structural folds while recapitulating, on average, 87.7% of the SS elements. The FoldDesign scaffolds also had well-formed structures, with buried residues and solvent-exposed areas closely matching their native counterparts. Despite the high fidelity to the input SS restraints and to the local structural characteristics of native proteins, a large portion of the designed scaffolds possessed global folds completely different from natural proteins in the PDB, highlighting the ability of FoldDesign to explore novel areas of protein fold space. Detailed data analyses revealed that the major contributions to the successful structure design lay in the optimal energy force field, which contains a balanced set of SS packing terms, and in the REMC simulations, which were coupled with multiple auxiliary movements to efficiently search the conformational space. Additionally, the ability to recognize and assemble uncommon super-SS geometries, rather than the unique arrangement of common SS motifs, was the key to generating novel folds. These results demonstrate a strong potential for computational design simulations to explore both structural and functional spaces that natural proteins have not reached through evolution.
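
To make the replica-exchange Monte Carlo (REMC) idea in the abstract concrete, the following is a minimal sketch of the generic technique, not the FoldDesign implementation. It runs several replicas at different temperatures, applies Metropolis moves within each replica, and periodically attempts to exchange neighbouring replicas. The one-dimensional `toy_energy` landscape, the temperature ladder, the step size, and all function names are assumptions chosen for illustration; FoldDesign's force field, fragment moves, and auxiliary movements are not reproduced here.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Metropolis criterion: always accept downhill, sometimes accept uphill."""
    return delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature)


def swap_probability(e_i, e_j, t_i, t_j):
    """Probability of exchanging two replicas: min(1, exp[(1/t_i - 1/t_j)(e_i - e_j)])."""
    exponent = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


def toy_energy(x):
    """Hypothetical rugged 1-D landscape (assumption; not a protein force field)."""
    return 0.1 * x * x + math.sin(5.0 * x)


def remc(energy, temperatures, n_sweeps=2000, step=0.5, seed=0):
    """Run a toy REMC search and return the lowest-energy (state, energy) seen."""
    rng = random.Random(seed)
    states = [rng.uniform(-5.0, 5.0) for _ in temperatures]
    energies = [energy(x) for x in states]
    best_x, best_e = min(zip(states, energies), key=lambda pair: pair[1])

    for sweep in range(n_sweeps):
        # Local Monte Carlo move within each replica at its own temperature.
        for k, temp in enumerate(temperatures):
            trial = states[k] + rng.uniform(-step, step)
            e_trial = energy(trial)
            if metropolis_accept(e_trial - energies[k], temp, rng):
                states[k], energies[k] = trial, e_trial
                if e_trial < best_e:
                    best_x, best_e = trial, e_trial

        # Exchange attempts between neighbouring temperatures,
        # alternating between even and odd pairs on successive sweeps.
        for k in range(sweep % 2, len(temperatures) - 1, 2):
            p = swap_probability(energies[k], energies[k + 1],
                                 temperatures[k], temperatures[k + 1])
            if rng.random() < p:
                states[k], states[k + 1] = states[k + 1], states[k]
                energies[k], energies[k + 1] = energies[k + 1], energies[k]

    return best_x, best_e


if __name__ == "__main__":
    x, e = remc(toy_energy, temperatures=[0.1, 0.5, 1.5, 4.0])
    print(f"lowest energy found: {e:.3f} at x = {x:.3f}")
```

The high-temperature replicas cross energy barriers easily, while the low-temperature replicas refine whatever basin they currently occupy; the exchange step lets a good conformation found at high temperature migrate down the ladder. In a fold-design setting the scalar `x` would be replaced by a backbone conformation and the random step by structural moves, but those details are beyond what the abstract specifies.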

Identifiers

pubmed: 36656852
doi: 10.1073/pnas.2208275120
pmc: PMC9942881

Chemical substances

Proteins 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

e2208275120

Grants

Agency: NCI NIH HHS
ID: U24 CA210967
Country: United States
Agency: NIH HHS
ID: S10 OD026825
Country: United States
Agency: NIAID NIH HHS
ID: R01 AI134678
Country: United States
Agency: NIGMS NIH HHS
ID: R35 GM136422
Country: United States
Agency: NIEHS NIH HHS
ID: P30 ES017885
Country: United States

Authors

Robin Pearce (R)

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109.

Xiaoqiang Huang (X)

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109.

Gilbert S Omenn (GS)

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109.
Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109.
Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109.
School of Public Health, University of Michigan, Ann Arbor, MI 48109.

Yang Zhang (Y)

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109.
Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109.
Department of Computer Science, School of Computing, National University of Singapore 117417, Singapore.
Cancer Science Institute of Singapore, National University of Singapore 117599, Singapore.
