Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/1407
DC Field | Value | Language
dc.contributor.author | Mitić, Nenad | en_US
dc.contributor.author | Malkov, Saša | en_US
dc.contributor.author | Maljković Ružičić, Mirjana | en_US
dc.contributor.author | Veljković, Aleksandar N. | en_US
dc.contributor.author | Čukić, Ivan Lj | en_US
dc.contributor.author | Lin, Xin | en_US
dc.contributor.author | Lyu, Minjie | en_US
dc.contributor.author | Brusić, Vladimir | en_US
dc.date.accessioned | 2025-01-16T07:40:54Z | -
dc.date.available | 2025-01-16T07:40:54Z | -
dc.date.issued | 2025-01-06 | -
dc.identifier.uri | https://research.matf.bg.ac.rs/handle/123456789/1407 | -
dc.description.abstract | When applying data mining or machine learning techniques to large and diverse datasets, it is often necessary to construct descriptive and predictive models. Descriptive models are used to discover relationships between the attributes of the data while predictive models identify the characteristics of the data that will be collected in the future. Bioinformatics data is high-dimensional, making it practically impossible to apply the majority of “classical” algorithms for classification and clustering. Even if the algorithms are useful, training with large multidimensional data significantly increases processing time. The algorithms specialized for working with high-dimensional data often cannot process data containing large data sets with several thousand dimensions (features). Dimension reduction methods (such as PCA) do not provide satisfactory results, and also obscure the meaning of the original attributes in the data. For the constructed models to be usable, they must fulfill the requirement of scalability, as the amount of bioinformatics data is increasing rapidly. Furthermore, the significance of individual data features can differ from source to source. This paper describes an attribute selection method for efficient classification of high-dimensional (30,698) transcriptomics data collected from different sources. The proposed method was tested with 22 classification algorithms. The classification results for the selected attribute sets are comparable to the results for the complete attribute set. | en_US
dc.language.iso | en | en_US
dc.publisher | Springer | en_US
dc.relation.ispartof | Journal of Big Data | en_US
dc.subject | Classification | en_US
dc.subject | Feature selection | en_US
dc.subject | High-dimensional data | en_US
dc.subject | Transcriptomics data | en_US
dc.title | Correlation-based feature selection of single cell transcriptomics data from multiple sources | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1186/s40537-024-01051-z | -
dc.identifier.scopus | 2-s2.0-85214266065 | -
dc.identifier.isi | 001390532000001 | -
dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/85214266065 | -
dc.contributor.affiliation | Informatics and Computer Science | en_US
dc.contributor.affiliation | Informatics and Computer Science | en_US
dc.contributor.affiliation | Informatics and Computer Science | en_US
dc.relation.issn | 2196-1115 | en_US
dc.description.rank | M21a | en_US
dc.relation.firstpage | Article no. 4 | en_US
dc.relation.volume | 12 | en_US
dc.relation.issue | 1 | en_US
item.openairetype | Article | -
item.fulltext | No Fulltext | -
item.cerifentitytype | Publications | -
item.grantfulltext | none | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.languageiso639-1 | en | -
crisitem.author.dept | Informatics and Computer Science | -
crisitem.author.dept | Informatics and Computer Science | -
crisitem.author.dept | Informatics and Computer Science | -
crisitem.author.orcid | 0000-0002-4385-6322 | -
crisitem.author.orcid | 0000-0002-4390-9631 | -
Appears in Collections: Research outputs