Building an emotion lexicon for Serbian using curated language resources

Šošić, Milena; Graovac, Jelena; Stanković, Ranka

Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/3218

DC Field	Value	Language
dc.contributor.author	Šošić, Milena	en_US
dc.contributor.author	Graovac, Jelena	en_US
dc.contributor.author	Stanković, Ranka	en_US
dc.date.accessioned	2026-03-18T17:57:34Z	-
dc.date.available	2026-03-18T17:57:34Z	-
dc.date.issued	2026-03-01	-
dc.identifier.issn	1574020X	-
dc.identifier.uri	https://research.matf.bg.ac.rs/handle/123456789/3218	-
dc.description.abstract	This article introduces a methodology for developing the first emotional affect lexicon for the Serbian language. The proposed methodology leverages Large Language Models (LLMs), specifically the GPT-3-based gpt-3.5-turbo and GPT-4-based gpt-4.1 models, in conjunction with the Serbian WordNet language resource to align the English lexicon with Serbian-specific morphological and linguistic characteristics. The effectiveness of the Serbian emotion lexicon (EmoLex.SR), comprising 13,584 affective words, has been validated through emotion detection experiments using emotion-annotated corpora. In these experiments, the lexicon outperformed the NRC lexicon automatically translated into Serbian, achieving a macro F1 score of 74.4% for sentences written in Serbian. In particular, the lexicon outperforms its automatically translated counterpart in detecting emotional categories across three distinct datasets, with an average improvement of 14.7% in macro F1 score. The development of the EmoLex.SR lexicon and the accompanying annotated parallel corpora, referred to as LLM-Emo.SR, extends the emotion detection capabilities for Serbian language processing. This enables a more accurate interpretation of emotions in Serbian text and enhances Natural Language Processing applications for the Serbian language. Although the methodology for creating the lexicon is demonstrated for Serbian, it can also be applied to other languages. The lexicon is made publicly available to the scientific community for use and further refinement.	en_US
dc.language.iso	en	en_US
dc.publisher	Springer	en_US
dc.relation.ispartof	Language Resources and Evaluation	en_US
dc.subject	Affect	en_US
dc.subject	Emotions	en_US
dc.subject	Lexicons	en_US
dc.subject	Serbian	en_US
dc.subject	WordNet	en_US
dc.title	Building an emotion lexicon for Serbian using curated language resources	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1007/s10579-025-09894-5	-
dc.identifier.scopus	2-s2.0-105027264089	-
dc.identifier.isi	001655357900002	-
dc.identifier.url	https://api.elsevier.com/content/abstract/scopus_id/105027264089	-
dc.contributor.affiliation	Informatics and Computer Science	en_US
dc.relation.issn	1574-020X	en_US
dc.description.rank	M22	en_US
dc.relation.firstpage	Article no. 9	en_US
dc.relation.volume	60	en_US
dc.relation.issue	1	en_US
item.fulltext	No Fulltext	-
item.grantfulltext	none	-
item.openairetype	Article	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.languageiso639-1	en	-
item.cerifentitytype	Publications	-
crisitem.author.dept	Informatics and Computer Science	-
crisitem.author.orcid	0000-0002-9323-4695	-
Appears in Collections:	Research outputs
