Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/327
DC Field | Value | Language
dc.contributor.author | Maljković Ružičić, Mirjana | en_US
dc.contributor.author | Mitić, Nenad | en_US
dc.contributor.author | de Brevern, Alexandre G | en_US
dc.date.accessioned | 2022-08-09T12:47:18Z | -
dc.date.available | 2022-08-09T12:47:18Z | -
dc.date.issued | 2022 | -
dc.identifier.issn | 03009084 | -
dc.identifier.uri | https://research.matf.bg.ac.rs/handle/123456789/327 | -
dc.description.abstract | The 3D structures of proteins determine their biological functions. The 3D structure of the protein backbone can be approximated using prototypes of local protein conformations; sets of these prototypes are called structural alphabets (SAs). Among the several approaches to predicting 3D structures from amino acid sequences, one is based on predicting SA prototypes for a given amino acid sequence. Protein Blocks (PBs) is the best-known SA. It is composed of 16 prototypes, each spanning five consecutive amino acids, which were identified as optimal with respect to both their ability to approximate the local structure and the accuracy with which they can be predicted from an amino acid sequence. We developed models for PB prediction from sequence information using different data mining approaches and machine learning algorithms. Besides the amino acid sequences, the outputs of the following tools were used to train the models: the Spider3 predictor of protein structure properties, several predictors of intrinsically disordered protein regions, and a tool for finding repeats in amino acid sequences. The highest accuracy of the constructed models is 80%, a significant improvement over the previous best available prediction, whose accuracy was 61%. Analysis of the models showed that the significance of the input attributes differs depending on the algorithm used to construct them. Using information about whether amino acids belong to intrinsically disordered regions and repeats improves the precision of prediction for some PBs with the CART classification algorithm, but not with the C5.0 classification algorithm. Improved prediction approaches can have interesting applications in protein structural modelling approaches or computational protein design. | en_US
dc.language.iso | en | en_US
dc.publisher | Elsevier | en_US
dc.relation.ispartof | Biochimie | en_US
dc.subject | Amino acid sequence | en_US
dc.subject | Disorder predictors | en_US
dc.subject | Machine learning | en_US
dc.subject | Protein blocks | en_US
dc.subject | Repeats | en_US
dc.subject | Spider3 | en_US
dc.title | Prediction of structural alphabet protein blocks using data mining | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1016/j.biochi.2022.01.019 | -
dc.identifier.pmid | 35143919 | -
dc.identifier.scopus | 2-s2.0-85124655573 | -
dc.identifier.isi | 000820447500007 | -
dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/85124655573 | -
dc.contributor.affiliation | Informatics and Computer Science | en_US
dc.contributor.affiliation | Informatics and Computer Science | en_US
dc.relation.issn | 0300-9084 | en_US
dc.description.rank | M22 | en_US
dc.relation.firstpage | 74 | en_US
dc.relation.lastpage | 85 | en_US
dc.relation.volume | 197 | en_US
item.openairetype | Article | -
item.fulltext | No Fulltext | -
item.cerifentitytype | Publications | -
item.grantfulltext | none | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.languageiso639-1 | en | -
crisitem.author.dept | Informatics and Computer Science | -
crisitem.author.dept | Informatics and Computer Science | -
crisitem.author.orcid | 0000-0002-4390-9631 | -
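
The abstract describes predicting a Protein Block for positions in an amino acid sequence, where each block covers five consecutive residues. The sketch below illustrates only the generic sliding-window step that such sequence-based predictors commonly start from; it is not the authors' pipeline. The function names, the padding symbol, the flank width, and the one-hot encoding are all assumptions made for illustration, and the abstract does not state which encoding the paper used.

```python
from typing import List

PAD = "-"  # hypothetical padding symbol for positions beyond the chain ends
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def residue_windows(sequence: str, flank: int = 2) -> List[str]:
    """Return one window of 2*flank+1 residues centred on each position.

    With flank=2 the window is five residues wide, matching the length of a
    Protein Block given in the abstract (the flank value is an assumption).
    """
    padded = PAD * flank + sequence + PAD * flank
    width = 2 * flank + 1
    return [padded[i:i + width] for i in range(len(sequence))]


def one_hot(window: str, alphabet: str = AMINO_ACIDS) -> List[int]:
    """Encode a window as a flat 0/1 vector; padding and unknowns stay all-zero."""
    vector: List[int] = []
    for residue in window:
        vector.extend(1 if residue == letter else 0 for letter in alphabet)
    return vector


if __name__ == "__main__":
    for window in residue_windows("MKVLA"):
        print(window, sum(one_hot(window)))
```

Each encoded window could then be combined with further per-residue attributes (the abstract mentions Spider3 outputs, disorder predictions, and repeat information) and passed to a decision-tree learner such as CART or C5.0.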
Appears in Collections: Research outputs

Scopus citations: 3 (checked on Mar 6, 2025)
Page views: 27 (checked on Jan 19, 2025)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.