Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/2013
Title: HerCulB: content-based information extraction and retrieval for cultural heritage of the Balkans
Authors: Tanasijević, Ivana; Pavlović-Lažetić, Gordana
Affiliations: Informatics and Computer Science 
Keywords: Content-based search; Information extraction; Intangible cultural heritage; Natural language processing
Issue Date: 12-Dec-2020
Rank: M22
Publisher: Emerald Publishing
Journal: Electronic Library
Abstract: 
Purpose: This paper presents a methodology for automatically annotating a multimedia collection of intangible cultural heritage, mostly in the form of interviews. The assigned annotations provide a way to search the collection.
Design/methodology/approach: Annotation is based on automatic metadata extraction. Named entities and topics are extracted from textual descriptions with a rule-based approach supported by vocabulary resources, a compiled domain-specific classification scheme and domain-oriented corpus analysis.
Findings: Applied to the cultural heritage of the Balkans, the proposed methodology achieves an F-measure of 0.87 for named entity annotation and 0.90 for topic annotation. The overall methodology enables encapsulating domain-specific and language-specific knowledge into collections of finite state transducers and allows further improvements.
Originality/value: Although cultural heritage plays a significant role in the development of the identity of a group or an individual, it is one of those specific domains that have not yet been fully explored for many languages. The proposed methodology can be used to incorporate natural language processing techniques into digital libraries of cultural heritage.
URI: https://research.matf.bg.ac.rs/handle/123456789/2013
ISSN: 0264-0473
DOI: 10.1108/EL-03-2020-0052
Appears in Collections: Research outputs


Scopus citations: 7 (checked on May 12, 2025)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.