Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/108631
DC Field | Value | Language
dc.contributor.author | Melo, Rita | -
dc.contributor.author | Fieldhouse, Robert | -
dc.contributor.author | Melo, André | -
dc.contributor.author | Correia, João D. G. | -
dc.contributor.author | Cordeiro, Maria Natália D. S. | -
dc.contributor.author | Gümüş, Zeynep H. | -
dc.contributor.author | Costa, Joaquim | -
dc.contributor.author | Bonvin, Alexandre M. J. J. | -
dc.contributor.author | Moreira, Irina S. | -
dc.date.accessioned | 2023-09-06T08:35:48Z | -
dc.date.available | 2023-09-06T08:35:48Z | -
dc.date.issued | 2016-07-27 | -
dc.identifier.issn | 1422-0067 | pt
dc.identifier.uri | https://hdl.handle.net/10316/108631 | -
dc.description.abstract | Understanding protein-protein interactions is a key challenge in biochemistry. In this work, we describe a more accurate methodology to predict Hot-Spots (HS) in protein-protein interfaces from their native complex structure compared to previously published Machine Learning (ML) techniques. Our model is trained on a large number of complexes and on a significantly larger number of different structural- and evolutionary sequence-based features. In particular, we added interface size, type of interaction between residues at the interface of the complex, number of different types of residues at the interface and the Position-Specific Scoring Matrix (PSSM), for a total of 79 features. We used twenty-seven algorithms, ranging from a simple linear-based function to support-vector machine models with different cost functions. The best model was achieved by the use of the conditional inference random forest (c-forest) algorithm with a dataset pre-processed by the normalization of features and with up-sampling of the minor class. The method has an overall accuracy of 0.80, an F1-score of 0.73, a sensitivity of 0.76 and a specificity of 0.82 for the independent test set. | pt
dc.language.iso | eng | pt
dc.publisher | MDPI | pt
dc.relation | SFRH/BPD/97650/2013 | pt
dc.relation | UID/Multi/04349/2013 | pt
dc.relation | FCT Investigator program - IF/00578/2014 | pt
dc.relation | Marie Skłodowska-Curie Individual Fellowship MSCA-IF-2015 (MEMBRANEPROT 659826) | pt
dc.relation | UID/NEU/04539/2013 | pt
dc.relation | Center for Basic and Translational Research on Disorders of the Digestive System, Rockefeller University, through the generosity of the Leona M. and Harry B. Helmsley Charitable Trust and start-up funds of the Icahn School of Medicine at Mount Sinai | pt
dc.rights | openAccess | pt
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | pt
dc.subject | protein-protein interfaces | pt
dc.subject | hot-spots | pt
dc.subject | machine learning | pt
dc.subject | Solvent Accessible Surface Area (SASA) | pt
dc.subject | evolutionary sequence conservation | pt
dc.subject.mesh | Algorithms | pt
dc.subject.mesh | Computational Biology | pt
dc.subject.mesh | Databases, Protein | pt
dc.subject.mesh | Humans | pt
dc.subject.mesh | Protein Conformation | pt
dc.subject.mesh | Protein Interaction Domains and Motifs | pt
dc.subject.mesh | Protein Interaction Mapping | pt
dc.subject.mesh | Proteins | pt
dc.subject.mesh | Machine Learning | pt
dc.title | A Machine Learning Approach for Hot-Spot Detection at Protein-Protein Interfaces | pt
dc.type | article | -
degois.publication.firstPage | 1215 | pt
degois.publication.issue | 8 | pt
degois.publication.title | International Journal of Molecular Sciences | pt
dc.peerreviewed | yes | pt
dc.identifier.doi | 10.3390/ijms17081215 | pt
degois.publication.volume | 17 | pt
dc.date.embargo | 2016-07-27 | *
uc.date.periodoEmbargo | 0 | pt
item.grantfulltext | open | -
item.cerifentitytype | Publications | -
item.languageiso639-1 | en | -
item.openairetype | article | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.fulltext | With full text | -
crisitem.author.orcid | 0000-0003-2970-5250 | -
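
The abstract mentions up-sampling of the minor class and reports accuracy, F1-score, sensitivity and specificity. As a rough illustration of what those terms mean, the sketch below shows random up-sampling by duplication and the four metrics computed from binary predictions. It is not the authors' pipeline: the function names `upsample_minority` and `binary_metrics`, the toy labels, and the choice of random duplication as the up-sampling scheme are all assumptions made for this example.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller class(es) until every
    class is as large as the biggest one. Hypothetical helper; the record
    does not say which up-sampling scheme the authors used."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall on hot-spots
    specificity = tn / (tn + fp) if tn + fp else 0.0  # recall on non-hot-spots
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Toy labels (1 = hot-spot, 0 = null-spot); not data from the article.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)

balanced_rows, balanced_labels = upsample_minority(["a", "b", "c", "d"], [1, 0, 0, 0])
```

Sensitivity and specificity are reported separately because hot-spots are the rarer class: accuracy alone can look good for a model that rarely predicts a hot-spot.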
Appears in Collections: I&D CNC - Artigos em Revistas Internacionais
This item is licensed under a Creative Commons License.