Please use this identifier to cite or link to this item:
https://research.matf.bg.ac.rs/handle/123456789/1653
Title: | Model sistema za ekstrakciju informacija iz tekstova pisanih na srpskom jeziku (A Model of a System for Information Extraction from Texts Written in Serbian) |
Authors: | Vujičić Stanković, Staša |
Affiliations: | Informatics and Computer Science |
Keywords: | Ontology-Based Information Extraction; semantic web; natural language processing; Serbian language |
Issue Date: | 2013 |
Publisher: | Beograd : Fakultet organizacionih nauka |
Journal: | Info M |
Abstract: | This paper presents the growing need to upgrade existing systems for information extraction, that is, for extracting the information relevant to the user, with semantic data and technologies. It also presents the problems of processing the ever-increasing number of digital documents written in Serbian. In addition to the problems that stem from the nature of Serbian as a morphologically rich language, the lack of semantic resources and tools for Serbian is also considered. Accordingly, the paper describes the development of a model of an ontology-driven information extraction system for texts in Serbian. The model is based on integrating existing resources developed for processing texts in Serbian as well as those intended for processing texts in English, on adapting existing techniques and tools, and on new ways of applying them in order to overcome the problems listed above. The main contribution of this model is to encourage the development of ontology-based systems that extract user-relevant information for various domains and applications.

This paper motivates the need to enrich Information Extraction systems with Semantic Web data and technologies, stressing the existing problems related to Serbian. Serbian is a morphologically rich language, which causes various issues in natural language processing tasks and especially in Information Extraction, and it lacks semantic resources and tools. It is therefore necessary to develop techniques for overcoming these problems, with particular emphasis on taking advantage of existing English resources and tools. A model for Ontology-Based Information Extraction is proposed to deal with the shortcomings of Information Extraction for Serbian and to enhance the core Information Extraction process by incorporating the semantic knowledge encapsulated in ontologies. The paper describes the model, which is based on: integrating existing resources developed for processing texts in Serbian as well as those intended for processing texts in English; adapting existing techniques and tools; and devising new ways of applying them in order to overcome significant challenges. We believe this model will encourage the development of Ontology-Based Information Extraction systems for specific domains and applications, dealing with the increasing volume of data in Serbian. |
URI: | https://research.matf.bg.ac.rs/handle/123456789/1653 |
ISSN: | 1451-4397 |
Appears in Collections: | Research outputs |