Browsing by Subject "q-grams"
Making dense codes even denser
Grabowski, Szymon (Wydawnictwa AGH, 2008)
Item type: Article. Access status: Open Access.

Dense byte-oriented compression codes are a useful tool for compressing textual databases over a large alphabet. The requirement for a large alphabet is naturally fulfilled for most human languages, where the symbols can be words, but non-segmented texts can be handled similarly, using $q$-grams. Recently, several interesting schemes have been presented that combine speed, high compression ratios, fast search support and simplicity. In this work, we show a couple of simple ideas that slightly increase the compression ratios of common byte codes, such as ($s,c$)-DC or tagged Huffman, assuming the text is static. Preliminary experimental results with one of those techniques show that it is more efficient with $q$-gram compression, where the compression ratio often improves by 1% or more, without compromising search or decoding efficiency and simplicity.
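
For readers unfamiliar with the byte codes named in the abstract, the following is a minimal sketch of the baseline ($s,c$)-DC codeword assignment as it is commonly described: byte values below $s$ act as stoppers (they end a codeword), the remaining $c = 256 - s$ values act as continuers, and a symbol's codeword is derived from its frequency rank. It does not implement the improvements proposed in the article. The function names `encode_rank` and `decode_codeword` and the default `s = 192` are illustrative assumptions, not taken from the article.

```python
def encode_rank(rank: int, s: int = 192) -> bytes:
    """Return the (s,c)-DC codeword for a symbol with the given frequency rank.

    Rank 0 is the most frequent symbol. Bytes 0..s-1 are stoppers and
    bytes s..255 are continuers (c = 256 - s). The value s = 192 is only
    an illustrative default.
    """
    if not 1 <= s <= 255:
        raise ValueError("s must be in 1..255 so that c = 256 - s >= 1")
    if rank < 0:
        raise ValueError("rank must be non-negative")
    c = 256 - s
    out = [rank % s]              # final byte: a stopper
    x = rank // s
    while x > 0:                  # earlier bytes: continuers, built right to left
        x -= 1
        out.append(s + x % c)
        x //= c
    return bytes(reversed(out))


def decode_codeword(code: bytes, s: int = 192) -> int:
    """Inverse of encode_rank: map one complete codeword back to its rank."""
    if not code:
        raise ValueError("empty codeword")
    c = 256 - s
    rank = 0
    for b in code[:-1]:
        if b < s:
            raise ValueError("stopper byte before the end of the codeword")
        rank = rank * c + (b - s) + 1
    last = code[-1]
    if last >= s:
        raise ValueError("codeword does not end with a stopper byte")
    return rank * s + last


if __name__ == "__main__":
    for r in (0, 191, 192, 12479, 12480):
        cw = encode_rank(r)
        print(r, list(cw), decode_codeword(cw))
```

Because every codeword ends with a byte smaller than $s$, codeword boundaries can be found by inspecting single bytes, which is one reason such codes support fast search directly on the compressed text.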
