Please use this identifier to cite or link to this item:
https://research.matf.bg.ac.rs/handle/123456789/316
Title: Could n-gram analysis contribute to genomic island determination?
Authors: Mitić, Nenad; Pavlović-Lazetić, Gordana M; Beljanski, Milos V
Keywords: Bacteria; Escherichia coli; Genomic islands; n-Grams
Issue Date: 2008
Journal: Journal of Biomedical Informatics
Abstract: There are two approaches to identifying genomic and pathogenesis islands (GI/PAIs) in bacterial genomes: the compositional and the functional, based on DNA or protein level composition and on gene function, respectively. We applied n-gram analysis in addition to other compositional features, combined them by union and intersection, and defined two measures for evaluating the results: recall and precision. Using the best criteria (obtained by training on the Escherichia coli O157:H7 EDL933 genome), we predicted GIs for 14 members of the Enterobacteriaceae family and for 21 randomly selected bacterial genomes. These predictions were compared with results obtained from HGT DB (based on the compositional approach) and PAI DB (based on the combined approach). The results show that intersecting n-grams with other compositional features improves relative precision by up to 10% in the case of HGT DB and by up to 60% in the case of PAI DB. In addition, the union of all compositional features was shown to give maximum recall (up to 37%). Thus, applying n-gram analysis alongside existing or newly developed methods may improve the prediction of GI/PAIs.
URI: https://research.matf.bg.ac.rs/handle/123456789/316
ISSN: 1532-0464
DOI: 10.1016/j.jbi.2008.03.007
Appears in Collections: Research outputs
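The abstract describes combining predictions from several compositional features by union and intersection and scoring them by recall and precision. A minimal sketch of that idea follows, assuming the standard set-based definitions of precision and recall (the paper's exact "relative" definitions are not given in this record). The gene identifiers and the names `ngram_hits`, `other_hits`, and `reference` are hypothetical and chosen only for illustration.

```python
def precision(predicted: set, reference: set) -> float:
    """Fraction of predicted items that also appear in the reference set."""
    return len(predicted & reference) / len(predicted) if predicted else 0.0


def recall(predicted: set, reference: set) -> float:
    """Fraction of reference items that were recovered by the prediction."""
    return len(predicted & reference) / len(reference) if reference else 0.0


# Hypothetical gene identifiers flagged by each method (not real data).
ngram_hits = {"g1", "g2", "g3", "g7"}   # flagged by an n-gram criterion
other_hits = {"g2", "g3", "g4", "g5"}   # flagged by another compositional feature
reference = {"g2", "g3", "g4", "g6"}    # stand-in for a reference database

union_pred = ngram_hits | other_hits    # flagged by at least one method
inter_pred = ngram_hits & other_hits    # flagged by both methods

for name, pred in [("union", union_pred), ("intersection", inter_pred)]:
    print(name, precision(pred, reference), recall(pred, reference))
```

In this toy data the intersection is stricter, so it scores higher precision and lower recall than the union, which matches the direction of the trade-off reported in the abstract.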
Scopus citations: 4 (checked on Dec 18, 2024)
Page views: 12 (checked on Dec 25, 2024)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.