Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/3219
DC Field | Value | Language
dc.contributor.author | Šošić, Milena | en_US
dc.contributor.author | Stanković, Ranka | en_US
dc.contributor.author | Graovac, Jelena | en_US
dc.date.accessioned | 2026-03-19T14:00:15Z | -
dc.date.available | 2026-03-19T14:00:15Z | -
dc.date.issued | 2024 | -
dc.identifier.uri | https://research.matf.bg.ac.rs/handle/123456789/3219 | -
dc.description.abstract | In the digital environment of South Slavic languages, emotion analysis of social media texts is becoming increasingly important for understanding public opinion, creating personalized content, and analyzing user interactions. This presentation describes in detail the methodology and results of annotating a Serbian-language corpus according to Plutchik's categorization model, which identifies eight basic emotional categories: joy, sadness, anger, fear, trust, disgust, anticipation, and surprise. The aim of the research is to analyze the emotional content of texts taken from the social media platforms X (formerly Twitter) and Reddit, each collection containing around 17,000 individual messages and approximately 5,000 complete conversations. The corpus annotation process involved several stages: data collection and preparation, manual annotation by experts, verification of annotation accuracy, and statistical analysis of the harmonized labels. The multi-label annotation approach made possible a richer and more detailed analysis of emotional states, which is particularly significant for analyzing the complex emotional content found on social media. To collect data, automated tools were used to download conversations written in Serbian from social media accounts that address current social, political, musical, and sports topics. Data preparation involved additional selection of messages to ensure the quality of their content while maintaining the conversational structure of the retrieved data. During data preparation, messages were preliminarily annotated using automatic methods, employing both classical and advanced computational linguistics techniques, to improve the efficiency of the manual labeling process. Teams of linguists and psychologists reviewed the automatically assigned labels and assessed whether each was valid for the text to which it was assigned.
To ensure high accuracy and consistency, standardized procedures were used for training annotators and for verifying their evaluations through statistical measures of annotation reliability. The analysis of annotation reliability demonstrated that it is possible to classify emotions in Serbian social media texts using Plutchik's model. Statistical analysis of the data revealed significant patterns in the distribution of emotions across the messages and provided insights into users' emotional reactions to various stimuli and thematic content. Social-Emo.SR, a multi-label emotion corpus in Serbian, represents a significant advancement toward a deeper understanding of the emotional dynamics of social media users. In addition to enriching linguistic resources for the Serbian language, this corpus opens new possibilities for research, commercial applications, and population-level mental health analysis. Applying modern methodologies to the developed corpus would enable the creation of useful tools for recognizing and reflecting the complexity of human emotions in today's digital world within the Serbian-speaking community. The corpus will be published under the open license CC-BY-4.0. | en_US
dc.language.iso | en | en_US
dc.publisher | Beograd : Filološki fakultet | en_US
dc.subject | Emotions | en_US
dc.subject | Plutchik's model | en_US
dc.subject | annotation | en_US
dc.subject | corpus | en_US
dc.subject | Social media | en_US
dc.subject | Serbian language | en_US
dc.title | Social-Emo.Sr: Emotional Multi-Label Categorization of Conversational Messages from Social Networks X and Reddit | en_US
dc.title.alternative | Social-Emo.SR: Emocionalna višeznačna kategorizacija konverzacionih poruka sa društvenih mreža X i Reddit | en_US
dc.type | Conference Object | en_US
dc.relation.conference | International Conference South Slavic Languages in Digital Environment (2024 ; Belgrade) | en_US
dc.relation.publication | International Conference South Slavic Languages in Digital Environment JuDig : Book of Abstracts | en_US
dc.identifier.url | https://judig.jerteh.rs/images/knjige/JUDIG-2024-book%20of%20abstracts.pdf | -
dc.contributor.affiliation | Informatics and Computer Science | en_US
dc.relation.isbn | 978-86-6153-754-7 | en_US
dc.description.rank | M34 | en_US
dc.relation.firstpage | 58 | en_US
dc.relation.lastpage | 59 | en_US
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.languageiso639-1 | en | -
item.openairetype | Conference Object | -
item.cerifentitytype | Publications | -
item.grantfulltext | none | -
item.fulltext | No Fulltext | -
crisitem.author.dept | Informatics and Computer Science | -
crisitem.author.orcid | 0000-0002-9323-4695 | -
Appears in Collections: Research outputs
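
The abstract mentions multi-label annotation over Plutchik's eight categories and "statistical measures of annotation reliability" without naming a specific measure. As an illustration only, the sketch below assumes two annotators and a per-emotion Cohen's kappa, which is one common choice and not necessarily the one the authors used. The function names and the four example messages are hypothetical and are not taken from the corpus.

```python
from typing import Dict, List, Set

# Plutchik's eight basic emotions, as listed in the abstract.
EMOTIONS = ["joy", "sadness", "anger", "fear",
            "trust", "disgust", "anticipation", "surprise"]


def binarize(labels: Set[str]) -> List[int]:
    """Encode one message's label set as a 0/1 vector over EMOTIONS."""
    unknown = labels - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if e in labels else 0 for e in EMOTIONS]


def cohen_kappa(a: List[int], b: List[int]) -> float:
    """Cohen's kappa for two binary rating sequences of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    # Agreement expected by chance from each annotator's label rate.
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        # Both annotators gave the same constant rating; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def per_label_kappa(ann_a: List[Set[str]],
                    ann_b: List[Set[str]]) -> Dict[str, float]:
    """Compute kappa separately for each emotion (assumed scheme)."""
    va = [binarize(s) for s in ann_a]
    vb = [binarize(s) for s in ann_b]
    return {e: cohen_kappa([v[i] for v in va], [v[i] for v in vb])
            for i, e in enumerate(EMOTIONS)}


# Hypothetical annotations for four messages; each message may carry
# several emotions at once (multi-label).
annotator_a = [{"joy", "trust"}, {"anger"}, {"anger", "disgust"}, {"joy"}]
annotator_b = [{"joy"}, {"anger"}, {"disgust"}, {"joy", "anticipation"}]

kappas = per_label_kappa(annotator_a, annotator_b)
for emotion, value in kappas.items():
    print(f"{emotion}: {value:.2f}")
```

Scoring each emotion separately keeps the multi-label structure visible: two annotators can agree fully on one emotion while diverging on another for the same message. With more than two annotators, a measure such as Fleiss' kappa or Krippendorff's alpha would be the usual substitute.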

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.