Computer Science
ISSN 1508-2806
e-ISSN: 2300-7036
Volume
Vol. 10
Date
2009
Description
Journal
Computer Science
AGH University Press (2004-)
ISSN: 1508-2806 e-ISSN: 2300-7036
Contains
Journal Issues
Articles
Usage of dedicated data structures for URL databases in a large-scale crawling
(Wydawnictwa AGH, 2009) Dorosz, Krzysztof
The article discusses the use of Berkeley DB data structures, such as hash tables and B-trees, to implement a high-performance URL database. It presents a formal model of a data-structure-oriented URL database that can serve as an alternative to a relational URL database.
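As a rough illustration of the idea of a key-value URL store, the sketch below keeps crawled URLs in a hash table and a queue. It is not the article's formal model, which the abstract does not spell out: the class name `UrlDatabase`, its methods, and the status strings are hypothetical, and a plain in-memory Python `dict` stands in for a Berkeley DB hash table.

```python
from collections import deque
from urllib.parse import urldefrag


class UrlDatabase:
    """In-memory stand-in for a key-value URL store (hypothetical API)."""

    def __init__(self):
        self.seen = {}            # hash table: URL -> crawl status
        self.frontier = deque()   # FIFO queue of URLs waiting to be fetched

    def add(self, url):
        """Queue a URL unless it is already known; return True if it was added."""
        key, _fragment = urldefrag(url)   # "#fragment" does not change the resource
        if key in self.seen:
            return False
        self.seen[key] = "queued"
        self.frontier.append(key)
        return True

    def next_url(self):
        """Return the next URL to fetch, or None when the frontier is empty."""
        if not self.frontier:
            return None
        url = self.frontier.popleft()
        self.seen[url] = "fetched"
        return url


db = UrlDatabase()
db.add("http://example.com/a#top")
db.add("http://example.com/a")    # duplicate after fragment removal, ignored
print(db.next_url())
```

The point of such a design is that the membership test in `add` is a single key lookup rather than a relational query. A persistent store would replace the `dict` with an on-disk hash table or B-tree.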
Enhancing regular expressions for Polish text processing
(Wydawnictwa AGH, 2009) Dorosz, Krzysztof; Szczerbińska, Anna
The paper proposes a regular expression engine, based on a modified Thompson's algorithm, dedicated to processing the Polish language. A Polish inflectional dictionary is used to enhance the engine and its syntax. Instead of using characters as the basic elements of regular expression patterns (as in the BRE or ERE standards), the presented tool allows words from a natural language, or labels describing the grammatical properties of words, to be used in regex syntax.
Automatyczna kontekstowa korekta tekstów z wykorzystaniem grafu LHG (Automatic contextual text correction using the LHG graph)
(Wydawnictwa AGH, 2009) Gadamer, Marcin; Horzyk, Adrian
Automatic text correction is an essential problem for today's text processors and editors. This paper introduces a novel algorithm for automatic contextual text correction using a Linguistic Habit Graph (LHG), which is also introduced in this paper. A specialized web crawler was built to search websites and construct the Linguistic Habit Graph from text corpora gathered from Polish websites. The correction results obtained with this algorithm and the LHG were compared with those of commercial programs that also offer text correction: Microsoft Word 2007, Open Office Writer 3.0, and the Google search engine. The results were much better than the corrections produced by these commercial tools.
Parallel algorithm for sorting animal pedigrees
(Wydawnictwa AGH, 2009) Gierdziewicz, Maciej
In many analyses of animal genotypes using the methods of quantitative genetics, relationships among individuals must be accounted for. Incorrectly calculated relationship coefficients may lead to biased estimates. A number of software packages deal with this problem; however, many of them assume that the pedigrees of the individuals are sorted chronologically, whereas in real data sets (containing information on traits and pedigrees) birth dates are often missing. In extreme cases, when (almost) no birth dates are present, the ordering must be made by comparing each pair of individuals separately at least once, since comparing adjacent elements is not sufficient to check whether the data set is sorted. Two versions of a parallel computer program were compared, with constant or variable distance between the elements of compared pairs. The results indicate that the second version, with variable distance, is more efficient.
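To make the ordering requirement concrete, here is a minimal sequential sketch of what a sorted pedigree means when birth dates are missing: every known parent must appear before its offspring. It uses Kahn's topological sort, which is a swapped-in technique and not the parallel pairwise-comparison algorithm the article evaluates. The record layout `(individual, sire, dam)` and the function names are assumptions made for this example.

```python
from collections import deque


def is_ordered(pedigree):
    """Return True if every parent listed in the pedigree precedes its offspring."""
    ids = {rec[0] for rec in pedigree}
    seen = set()
    for ind, sire, dam in pedigree:
        for parent in (sire, dam):
            # Parents absent from the data set (or None) impose no constraint.
            if parent in ids and parent not in seen:
                return False
        seen.add(ind)
    return True


def order_pedigree(pedigree):
    """Reorder records so that parents come first (Kahn's topological sort)."""
    records = {rec[0]: rec for rec in pedigree}
    children = {ind: [] for ind in records}
    pending = {}                      # number of not-yet-placed parents per individual
    for ind, sire, dam in pedigree:
        known = {p for p in (sire, dam) if p in records}
        pending[ind] = len(known)
        for p in known:
            children[p].append(ind)
    ready = deque(ind for ind in records if pending[ind] == 0)
    ordered = []
    while ready:
        ind = ready.popleft()
        ordered.append(records[ind])
        for child in children[ind]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    if len(ordered) != len(records):
        raise ValueError("pedigree contains a cycle")
    return ordered


# Hypothetical data: the calf is listed before both of its parents.
pedigree = [
    ("calf", "bull", "cow"),
    ("bull", None, None),
    ("cow", "grandsire", None),
    ("grandsire", None, None),
]
print(is_ordered(pedigree), is_ordered(order_pedigree(pedigree)))
```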
Using standard hardware accelerators to decrease computation times in scientific applications
(Wydawnictwa AGH, 2009) Kuna, Dawid; Jamro, Ernest; Russek, Paweł; Wiatr, Kazimierz
Nowadays, general-purpose processors are widely used in scientific computing. However, when high computational throughput is needed, it is worth considering whether dedicated hardware solutions would be more efficient in terms of performance (or performance-to-price ratio), power efficiency, or both. This paper briefly describes such solutions and compares them with the architecture of contemporary general-purpose processors.

