Author Profile prof. dr hab. inż.

Wiatr, Kazimierz

Employee

active

Discipline

information and communication technology
automation, electronics, electrical engineering and space technologies

Search Results

Now showing 1 - 10 of 22
  • Item type: Article, Access status: Open Access
    Using standard hardware accelerators to decrease computation times in scientific applications
    (Wydawnictwa AGH, 2009) Kuna, Dawid; Jamro, Ernest; Russek, Paweł; Wiatr, Kazimierz
    General-purpose processors are widely used in scientific computing. However, when high computational throughput is needed, it is worth considering whether dedicated hardware solutions would be more efficient in terms of performance (or performance-to-price ratio), power efficiency, or both. This paper briefly describes such solutions and compares them with the architecture of contemporary general-purpose processors.
  • Item type: Article, Access status: Open Access
    Compressing sentiment analysis CNN models for efficient hardware processing
    (Wydawnictwa AGH, 2020) Wróbel, Krzysztof; Karwatowski, Michał; Wielgosz, Maciej; Pietroń, Marcin; Wiatr, Kazimierz
    Convolutional neural networks (CNNs) were created for image classification tasks. Shortly after their creation, they were applied to other domains, including natural language processing (NLP). Nowadays, solutions based on artificial intelligence appear on mobile devices and embedded systems, which places constraints on memory and power consumption, among others. Because of the memory and computing requirements of CNNs, they must be compressed before they can be mapped to hardware. This paper presents the results of the compression of efficient CNNs for sentiment analysis. The main steps involve pruning and quantization. The process of mapping the compressed network to an FPGA and the results of this implementation are described. The conducted simulations showed that a 5-bit width is enough to ensure no drop in accuracy compared to the floating-point version of the network. Additionally, the memory footprint was significantly reduced (by between 85 and 93% compared to the original model).
  • Item type: Article, Access status: Open Access
    Sprzętowa implementacja części wielomianowej funkcji orbitalnej na potrzeby obliczeń kwantowo-chemicznych [Hardware implementation of the polynomial part of the orbital function for quantum-chemical computations]
    (Wydawnictwa AGH, 2010) Wielgosz, Maciej; Jamro, Ernest; Russek, Paweł; Wiatr, Kazimierz
    A hardware acceleration module for generating the polynomial part of the orbital function in quantum chemistry calculations is presented. Both implementation and acceleration results are provided, along with comparison tests against an Itanium 2 processor. The implementation described can be regarded as a milestone on the way towards an efficient hardware implementation of the exchange-correlation potential. The FPGA-based SGI RASC accelerator was used to offload the processor in the most demanding computations of the SCF routine. The paper also covers issues regarding the integration of the PP (polynomial part) module with the rest of the computational system.
  • Item type: Article, Access status: Open Access
    Prototyp systemu profilowania pętli kodu źródłowego jako narzędzia analizy kodu w celu efektywnego przyspieszenia obliczeń wielkiej skali [Prototype of a source-code loop profiling system as a code analysis tool for efficient acceleration of large-scale computations]
    (Wydawnictwa AGH, 2010) Pietroń, Marcin; Russek, Paweł; Wiatr, Kazimierz
    This paper presents research on FPGA-based acceleration of HPC applications. The most important step towards this goal is to extract the code that can be sped up. A major drawback is the lack of a tool that could do this. HPC applications usually consist of a huge amount of complex source code, which is one of the reasons why the acceleration process should be as automated as possible. Another reason is to make use of high-level languages (HLLs) such as Mitrion-C and Impulse-C. Loop profiling is one of the steps in checking whether inserting HLL code into existing HPC source code can accelerate these applications. Hence, the most important step is to extract the most time-consuming code and its data dependencies, which makes the code easier to pipeline and parallelize. Data dependency also gives information on how to implement algorithms in an FPGA circuit with minimal initialization of the circuit during algorithm execution.
  • Item type: Article, Access status: Open Access
    Softprocesor wizyjny z rekonfigurowalną listą instrukcji [Vision soft processor with a reconfigurable instruction set]
    (Wydawnictwa AGH, 2006) Kwiatkowski, Marek; Kołton, Mariusz; Russek, Paweł; Wiatr, Kazimierz
    The authors present a hardware solution implemented in FPGA reconfigurable logic, proposed as a universal platform for image processing. Dedicated hardware is a traditional alternative to software methods in image processing because it offers a high ratio of processing power to hardware resources. The common disadvantage of that approach is its time-consuming implementation. The presented processor with a reconfigurable instruction set is a compromise between software and hardware and offers an easier design flow. An implementation of the discrete cosine transform is presented as an example.
  • Item type: Article, Access status: Open Access
    Accelerating SELECT WHERE and SELECT JOIN queries on a GPU
    (Wydawnictwa AGH, 2013) Pietroń, Marcin; Russek, Paweł; Wiatr, Kazimierz
    This paper presents implementations of a few selected SQL operations using the CUDA programming framework on the GPU platform. Parallel GPU architectures give a high speed-up on certain problems, so the number of non-graphical problems that can be run and sped up on the GPU keeps increasing. In particular, there has been a lot of research on data mining on GPUs, which in many cases demonstrates the advantage of offloading processing from the CPU to the GPU. At the beginning of our project we chose the set of SELECT WHERE and SELECT JOIN instructions as the most common operations used in databases. We parallelized these SQL operations using three main mechanisms in CUDA: thread group hierarchy, shared memories, and barrier synchronization. Our results show that the implemented highly parallel SELECT WHERE and SELECT JOIN operations on the GPU platform can be significantly faster than their sequential counterparts in a database system run on the CPU.
  • Item type: Article, Access status: Open Access
    Implementacja kodeka standardu MPEG-2 w układach FPGA [Implementation of an MPEG-2 codec in FPGA devices]
    (Wydawnictwa AGH, 2008) Dąbrowska-Boruch, Agnieszka; Wiatr, Kazimierz
    The compression method applied in the MPEG-2 standard combines different standards, namely JPEG and H.261. Because the video signal is a sequence of still pictures, compression techniques similar to those of the JPEG standard can be used. The paper presents implementation results of a video signal processing path compatible with the ISO/IEC 13818 standard specification in a Xilinx XC2VP100(-6)FF1704 chip.
  • Item type: Article, Access status: Open Access
    Modyfikacja algorytmu E3SS estymacji ruchu na potrzeby implementacji w układach FPGA [Modification of the E3SS motion estimation algorithm for implementation in FPGA devices]
    (Wydawnictwa AGH, 2006) Dąbrowska-Boruch, Agnieszka; Wiatr, Kazimierz
    Motion estimation is the process of calculating the shift between a macroblock of the current picture and the most similar macroblocks in the corresponding pictures. It is an important element of compression algorithms that deal with sequences of pictures, such as H.26x and MPEG. The article discusses the basic issues and the main algorithms of motion estimation. Additionally, the paper presents a modification of the Efficient Three-Step Search algorithm.
  • Item type: Article, Access status: Open Access
    Computation acceleration on SGI RASC: FPGA based reconfigurable computing hardware
    (Wydawnictwa AGH, 2008) Jamro, Ernest; Janiszewski, Marcin; Machaczek, Krzysztof; Russek, Paweł; Wiatr, Kazimierz; Wielgosz, Maciej
    In this paper a novel method of computation using FPGA technology is presented. In several cases this method provides a computational speedup with respect to general-purpose processors (GPPs). The main concept of this approach is to design the computing hardware architecture to fit the algorithm dataflow and to make the best use of well-known computing techniques such as pipelining and parallelism. Configurable hardware is used as an implementation platform for custom-designed hardware. The paper presents implementation results for algorithms used in areas such as cryptography, data analysis, and scientific computation. Other promising application areas of the new technology, for instance bioinformatics, are also mentioned. The algorithms were designed, tested, and implemented on the SGI RASC platform. The RASC module is part of Cyfronet's SGI Altix 4700 SMP system. The modern RASC architecture is also presented: in principle, it consists of FPGA chips and very fast, 128-bit wide local memory. The design tools available to designers are presented as well.
  • Item type: Article, Access status: Open Access
    Potokowe przetwarzanie obrazów w oparciu o środowisko EDK i magistralę OPB [Pipeline image processing based on the EDK environment and the OPB bus]
    (Wydawnictwa AGH, 2006) Jamro, Ernest; Wiatr, Kazimierz
    This paper introduces a novel architecture denoted as On-chip Pipeline Architecture (OPiAr). OPiAr is used for pipelined low-level image processing in field-programmable gate arrays (FPGAs). It employs the On-chip Peripheral Bus (OPB) developed by IBM and the Xilinx Embedded Development Kit (EDK), and it is a modification of the Dedicated Pipeline Architecture (DePiAr). Pipeline image processing, as was shown for DePiAr, reduces external memory accesses and facilitates low-level image processing.