HerCulB: content-based information extraction and retrieval for cultural heritage of the Balkans

Tanasijević, Ivana; Pavlović-Lažetić, Gordana

Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/2013

DC Field	Value	Language
dc.contributor.author	Tanasijević, Ivana	en_US
dc.contributor.author	Pavlović-Lažetić, Gordana	en_US
dc.date.accessioned	2025-05-12T11:36:26Z	-
dc.date.available	2025-05-12T11:36:26Z	-
dc.date.issued	2020-12-12	-
dc.identifier.issn	02640473	-
dc.identifier.uri	https://research.matf.bg.ac.rs/handle/123456789/2013	-
dc.description.abstract	Purpose: The purpose of this paper is to provide a methodology for automatic annotation of a multimedia collection of intangible cultural heritage mostly in the form of interviews. Assigned annotations provide a way to search the collection. Design/methodology/approach: Annotation is based on automatic extraction of metadata and is conducted by named entity and topic extraction from textual descriptions with a rule-based approach supported by vocabulary resources, a compiled domain-specific classification scheme and domain-oriented corpus analysis. Findings: The proposed methodology for automatic annotation of a collection of intangible cultural heritage, applied on the cultural heritage of the Balkans, has very good results according to F measure, which is 0.87 for the named entity and 0.90 for topic annotation. The overall methodology enables encapsulating domain-specific and language-specific knowledge into collections of finite state transducers and allows further improvements. Originality/value: Although cultural heritage has a significant role in the development of identity of a group or an individual, it is one of those specific domains that have not yet been fully explored in case of many languages. A methodology is proposed that can be used for incorporating natural language processing techniques into digital libraries of cultural heritage.	en_US
dc.language.iso	en	en_US
dc.publisher	Emerald Publishing	en_US
dc.relation.ispartof	Electronic Library	en_US
dc.subject	Content-based search	en_US
dc.subject	Information extraction	en_US
dc.subject	Intangible cultural heritage	en_US
dc.subject	Natural language processing	en_US
dc.title	HerCulB: content-based information extraction and retrieval for cultural heritage of the Balkans	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1108/EL-03-2020-0052	-
dc.identifier.scopus	2-s2.0-85094122200	-
dc.identifier.isi	000586499800001	-
dc.identifier.url	https://api.elsevier.com/content/abstract/scopus_id/85094122200	-
dc.contributor.affiliation	Informatics and Computer Science	en_US
dc.relation.issn	0264-0473	en_US
dc.description.rank	M22	en_US
dc.relation.firstpage	905	en_US
dc.relation.lastpage	918	en_US
dc.relation.volume	38	en_US
dc.relation.issue	5-6	en_US
item.cerifentitytype	Publications	-
item.languageiso639-1	en	-
item.fulltext	No Fulltext	-
item.openairetype	Article	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.grantfulltext	none	-
crisitem.author.dept	Informatics and Computer Science	-
crisitem.author.orcid	0000-0003-3764-1269	-
Appears in Collections:	Research outputs


Scopus citations: 9 (checked on Jun 9, 2026)

Page view(s): 4 (checked on Jun 9, 2026)
