Prediction of alphabets of local protein structures using data mining methods

Maljković Ružičić, Mirjana

Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/2087

Title:	Prediction of alphabets of local protein structures using data mining methods
Authors:	Maljković Ružičić, Mirjana
Affiliations:	Informatics and Computer Science
Keywords:	data mining; Structural alphabet; prediction model; Protein blocks
Issue Date:	2019
Rank:	M70
Publisher:	Beograd : Matematički fakultet
Abstract:	Proteins are linear biological polymers composed of amino acids, and their structure and function are determined by the number and order of those amino acids. Protein structure has three levels: primary, secondary and tertiary (three-dimensional, 3D) structure. Because experimental determination of protein 3D structure is expensive and time-consuming, it is important to develop predictors that infer properties of the 3D structure, such as the 3D structure of the protein backbone, from the amino acid sequence (primary structure). The 3D structure of the backbone can be described using prototypes of local protein structure, i.e. prototypes of protein fragments with a length of a few amino acids. A set of local structure prototypes forms a library of local protein structures, also called a structural alphabet. A structural alphabet is defined as a set of N prototypes, each L amino acids long. The subject of this dissertation is the development of models that predict structural alphabet prototypes for a given amino acid sequence using different data mining approaches. One of the best-known structural alphabets, Protein Blocks (PBs), was used in one part of the doctoral research. The PB alphabet consists of 16 prototypes defined on fragments of 5 consecutive amino acids. The input to the prediction model of structural alphabet prototypes combines the amino acid sequence with structural properties of the protein that can be determined from the sequence (occurrence of repeats in the amino acid sequence) and with the outputs of predictors of protein structural properties (backbone angles, secondary structures, occurrence of disordered regions, accessible surface area of amino acids).
Besides developing models for predicting the prototypes of an existing structural alphabet, the dissertation investigates the feasibility of developing new structural alphabets by applying the TwoStep clustering algorithm and constructing models that predict the prototypes of the new alphabets. Several structural alphabets, differing in prototype length and in number of prototypes, were constructed and analyzed. Fragments of a large number of proteins with experimentally determined structures were used to construct the new structural alphabets.
URI:	https://research.matf.bg.ac.rs/handle/123456789/2087
Appears in Collections:	Research outputs
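
Illustrative note (not taken from the dissertation): the abstract describes local structure in terms of fragments of L consecutive amino acids, with L = 5 for Protein Blocks. The minimal Python sketch below only enumerates such overlapping fragments for a sequence. The function name `fragments` and the example sequence are hypothetical, chosen for illustration. Assigning each fragment to one of the N prototypes would additionally require backbone geometry or a trained predictor, which is not shown here.

```python
def fragments(sequence: str, length: int = 5) -> list[str]:
    """Return all overlapping fragments of `length` consecutive residues.

    `length` plays the role of L in the abstract (5 for Protein Blocks).
    A sequence shorter than `length` yields an empty list.
    """
    if length < 1:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 10-residue sequence, used only for illustration.
example = "MKTAYIAKQR"
windows = fragments(example)

print(len(windows))  # 6 fragments: 10 - 5 + 1
print(windows[0])    # MKTAY
print(windows[-1])   # IAKQR
```

A sequence of n residues yields n - L + 1 fragments, so under this windowing only positions that have a full fragment around them can receive a prototype label.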
