Please use this identifier to cite or link to this item:
https://research.matf.bg.ac.rs/handle/123456789/1654
Title: | Nov metod ekstrakcije informacija baziran na transduktorima (A new information extraction method based on transducers) | Authors: | Pajić, Vesna; Pajić, Miloš; Vujičić Stanković, Staša |
Affiliations: | Informatics and Computer Science | Keywords: | Information extraction; natural language processing; data structuring | Issue Date: | 2012 | Publisher: | Beograd : Fakultet organizacionih nauka | Journal: | Info M | Abstract: | This paper gives an overview of information extraction, whose methods and techniques are indispensable in information search and information management. Information extraction combines techniques from other areas of mathematics and computer science, such as natural language processing, formal language theory, and probability and statistics. Taking into account the specifics of information requests and of the textual resources from which the information is extracted, we developed and present a new information extraction method, called the two-stage method based on transducers. The architecture of a system that implements this method is presented, along with an example of a concrete application. The method is especially significant in situations where no already annotated text corpora exist, since such corpora are necessary for applying existing methods, especially those based on probability and statistics. |
URI: | https://research.matf.bg.ac.rs/handle/123456789/1654 | ISSN: | 1451-4397 |
Appears in Collections: | Research outputs |