Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/2727
Title: Serbian text categorization using byte level n-grams
Authors: Graovac, Jelena 
Affiliations: Informatics and Computer Science 
Issue Date: 2012
Rank: M33
Publisher: CEUR
Related Publication(s): Local Proceedings of the Fifth Balkan Conference in Informatics
Journal: CEUR Workshop Proceedings
Conference: Balkan Conference in Informatics BCI (5 ; 2012 ; Novi Sad)
Abstract: 
This paper presents the results of classifying Serbian text documents using a technique based on byte-level n-gram frequency statistics, employing four different dissimilarity measures. Results show that byte-level n-gram text categorization, although very simple and language independent, achieves very good accuracy.
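As an illustration of the general idea only, the following is a minimal sketch of byte-level n-gram profiling and nearest-profile classification. It is not the paper's implementation. The function names, the choice of n = 3, the sample category texts, and the particular dissimilarity formula (a squared normalized frequency difference) are all assumptions made for this example; the abstract does not name the four measures actually used.

```python
from collections import Counter


def byte_ngram_profile(text, n=3, top=None):
    """Return normalized frequencies of byte-level n-grams of the UTF-8 text.

    `top` optionally keeps only the most frequent n-grams (None keeps all).
    """
    data = text.encode("utf-8")  # operate on bytes, not characters
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.most_common(top)}


def dissimilarity(p, q):
    """Assumed measure: sum of squared normalized frequency differences.

    Every n-gram in the union has a positive frequency in at least one
    profile, so the denominator is never zero.
    """
    total = 0.0
    for gram in set(p) | set(q):
        f1 = p.get(gram, 0.0)
        f2 = q.get(gram, 0.0)
        total += (2.0 * (f1 - f2) / (f1 + f2)) ** 2
    return total


def classify(text, category_profiles, n=3):
    """Pick the category whose profile is least dissimilar to the document."""
    doc = byte_ngram_profile(text, n)
    return min(category_profiles,
               key=lambda c: dissimilarity(doc, category_profiles[c]))


# Hypothetical toy category texts, for illustration only.
profiles = {
    "sport": byte_ngram_profile("fudbal utakmica gol tim"),
    "economy": byte_ngram_profile("banka dinar inflacija budzet"),
}
label = classify("gol na utakmici", profiles)
```

Because the profiles are built from raw bytes, the same code applies unchanged to text in any script or encoding, which is the sense in which such a technique is language independent.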
URI: https://research.matf.bg.ac.rs/handle/123456789/2727
Appears in Collections: Research outputs
