Please use this identifier to cite or link to this item:
https://research.matf.bg.ac.rs/handle/123456789/2087
Title: | Prediction of alphabets of local protein structures using data mining methods |
Authors: | Maljković Ružičić, Mirjana |
Affiliations: | Informatics and Computer Science |
Keywords: | data mining; Structural alphabet; prediction model; Protein blocks |
Issue Date: | 2019 |
Rank: | M70 |
Publisher: | Beograd : Matematički fakultet |
Abstract: | Proteins are linear biological polymers composed of amino acids whose structure and function are determined by the number and order of amino acids. The structure of a protein has three levels: primary, secondary and tertiary (three-dimensional, 3D) structure. Since the experimental determination of protein 3D structure is expensive and time-consuming, it is important to develop predictors of protein 3D structure properties, such as the 3D structure of the protein backbone, from the amino acid sequence (primary structure). The 3D structure of the backbone can be described using prototypes of local protein structure, i.e. prototypes of protein fragments with a length of a few amino acids. A set of local structure prototypes forms a library of local protein structures, also called a structural alphabet. A structural alphabet is defined as a set of N prototypes, each L amino acids long. The subject of this dissertation is the development of models for the prediction of structural alphabet prototypes for a given amino acid sequence using different data mining approaches. Protein Blocks (PBs), one of the best-known structural alphabets, was used in one part of the doctoral research. The PB structural alphabet consists of 16 prototypes that are defined using fragments of 5 consecutive amino acids.
The input to the prediction model of structural alphabet prototypes combines the amino acid sequence with structural properties of the protein that can be determined from the amino acid sequence (occurrence of repeats in the amino acid sequence) and with the results of predictors of protein structural properties (backbone angles, secondary structures, occurrence of disordered regions, accessible surface area of amino acids). Besides the development of models for predicting prototypes of an existing structural alphabet, the feasibility of developing new structural alphabets is investigated by applying the TwoStep clustering algorithm and constructing models for the prediction of prototypes of the new structural alphabets. Several structural alphabets, which differ in the length and the number of prototypes, have been constructed and analyzed. Fragments of a large number of proteins whose structures were experimentally determined were used to construct the new structural alphabets. |
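The definition of a structural alphabet given in the abstract (N prototypes, each L amino acids long) can be illustrated with a minimal sketch of how a backbone might be encoded as a string of prototype letters. This is not the dissertation's method. It assumes that each residue is described by a (phi, psi) pair of backbone dihedral angles in degrees, that fragments are compared to prototypes by a root mean square deviation over angles, and that the two prototypes `h` and `e` and their values are purely hypothetical toy data with L = 3, not the 16 Protein Blocks.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def rmsda(u, v):
    """Root mean square deviation over two equal-length lists of angles."""
    return math.sqrt(sum(angle_diff(a, b) ** 2 for a, b in zip(u, v)) / len(u))

def encode(angles, prototypes, length):
    """Slide a window of `length` residues over the backbone and label each
    window with the name of the closest prototype.

    angles     -- list of (phi, psi) tuples, one per residue (assumed input)
    prototypes -- dict mapping a one-letter name to a flat list of
                  2 * length angles (hypothetical toy alphabet)
    """
    letters = []
    for i in range(len(angles) - length + 1):
        # Flatten the window into phi1, psi1, phi2, psi2, ...
        window = [x for pair in angles[i:i + length] for x in pair]
        best = min(prototypes, key=lambda name: rmsda(window, prototypes[name]))
        letters.append(best)
    return "".join(letters)

# Hypothetical two-letter alphabet (N = 2, L = 3); values are illustrative only.
TOY_PROTOTYPES = {
    "h": [-60.0, -45.0] * 3,    # helix-like fragment
    "e": [-120.0, 130.0] * 3,   # strand-like fragment
}

helix_like = [(-60.0, -45.0)] * 5
strand_like = [(-120.0, 130.0)] * 4
print(encode(helix_like, TOY_PROTOTYPES, 3))
print(encode(strand_like, TOY_PROTOTYPES, 3))
```

A backbone of n residues yields n - L + 1 letters under this scheme, one per overlapping window. A prediction model of the kind described in the abstract would then try to recover such letters from the amino acid sequence and the additional input features.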
URI: | https://research.matf.bg.ac.rs/handle/123456789/2087 |
Appears in Collections: | Research outputs |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.