Please use this identifier to cite or link to this item:
https://research.matf.bg.ac.rs/handle/123456789/2918
Title: Ontology-driven conceptual document classification
Authors: Graovac, Jelena; Pavlović-Lažetić, Gordana
Affiliations: Informatics and Computer Science
Keywords: Artificial intelligence; clustering and classification methods; knowledge discovery and information retrieval; knowledge-based systems; symbolic systems
Issue Date: 2010
Rank: M33
Publisher: SciTePress
Related Publication(s): Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR 2010)
Conference: International Conference on Knowledge Discovery and Information Retrieval (KDIR) ([2] ; 2010 ; Valencia)
Abstract: Document classification based on the lexical-semantic network WordNet is presented. Two types of document classification in Serbian have been experimented with: classification based on chosen concepts from the Serbian WordNet (SWN) and proper names-based classification. Conceptual document classification criteria are constructed from hierarchies rooted in a set of chosen concepts (first case) or from hierarchies rooted in some of the proper names' hypernyms (second case). A classifier of the first type is trained and then tested on an indexed and already classified Ebart corpus of Serbian newspapers (476917 articles). Precision, recall and F-measure show that this type of classification is promising although incomplete, mainly due to the incompleteness of SWN. In the context of proper names-based classification, a proper names ontology based on the SWN is presented in the paper. A distance-based similarity measure is defined, based on Euclidean and Manhattan distances. Classification of a su…
URI: https://research.matf.bg.ac.rs/handle/123456789/2918
DOI: 10.5220/0003063903830386
Appears in Collections: Research outputs