Model sistema za ekstrakciju informacija iz tekstova pisanih na srpskom jeziku (A model of a system for information extraction from texts written in Serbian)

Vujičić Stanković, Staša

Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/1653

Title:	Model sistema za ekstrakciju informacija iz tekstova pisanih na srpskom jeziku (A model of a system for information extraction from texts written in Serbian)
Authors:	Vujičić Stanković, Staša
Affiliations:	Informatics and Computer Science
Keywords:	Ontology-Based Information Extraction;semantic web;natural language processing;Serbian language
Issue Date:	2013
Publisher:	Beograd : Fakultet organizacionih nauka
Journal:	Info M
Abstract:	This paper motivates the need to enrich Information Extraction systems, which extract the information relevant to a user, with Semantic Web data and technologies, and it highlights the problems that arise in processing the growing number of digital documents written in Serbian. Serbian is a morphologically rich language, which causes a range of difficulties in natural language processing tasks and in Information Extraction in particular. Semantic resources and tools for Serbian are also lacking, so techniques are needed to overcome these problems, with particular emphasis on taking advantage of existing English resources and tools. A model for Ontology-Based Information Extraction is proposed to address the shortcomings of Information Extraction for Serbian and to enhance the core Information Extraction process by incorporating the semantic knowledge encapsulated in ontologies.
The model is based on integrating existing resources developed for processing texts in Serbian as well as those intended for processing texts in English, adapting existing techniques and tools, and devising new ways of applying them in order to overcome these challenges. We believe the main contribution of this model is to encourage the development of Ontology-Based Information Extraction systems for different domains and applications that deal with the increasing volume of data in Serbian.
URI:	https://research.matf.bg.ac.rs/handle/123456789/1653
ISSN:	1451-4397
Appears in Collections:	Research outputs
