Journal Issue

Computer Science

ISSN 1508-2806
e-ISSN: 2300-7036

Issue Date

2015

Volume

Vol. 16

Number

No. 2

Access rights

Access: open access
Rights: CC BY 4.0
Attribution 4.0 International (CC BY 4.0)

Description

Reviewed by: Dariusz Madej, Adam Wierzbicki, Adam Jatowt, Krzysztof Marasek, Maria Orłowska, Yanchun Zhang. This issue was edited by: Maria Orłowska, Krzysztof Marasek and Adam Wierzbicki

Journal Volume

Item type: Journal Volume
Computer Science
Vol. 16 (2015)

Articles

Item type: Article, Access status: Open Access
Pre-trained Deep Neural Network using Sparse Autoencoders and Scattering Wavelet Transform for musical genre recognition
(Wydawnictwa AGH, 2015) Kleć, Mariusz; Koržinek, Danijel
The research described in this paper attempts to combine Deep Neural Networks (DNN) with novel audio features extracted using the Scattering Wavelet Transform (SWT) to classify musical genres. The SWT uses a sequence of wavelet transforms to compute modulation spectrum coefficients of multiple orders, which have already been shown to be promising for this task. The layers of the DNN in this work are pre-trained using Sparse Autoencoders (SAE). Data obtained from the Creative Commons website jamendo.com is used to boost the well-known GTZAN database, which is a standard benchmark for this task. The final classifier is tested using 10-fold cross-validation and achieves results similar to other state-of-the-art approaches.
Item type: Article, Access status: Open Access
Application of linguistic cues in the analysis of language of hate groups
(Wydawnictwa AGH, 2015) Balcerzak, Bartłomiej; Jaworski, Wojciech
Hate speech and fringe ideologies are social phenomena that thrive online. Members of the political and religious fringe are able to propagate their ideas via the Internet with less effort than in traditional media. In this article, we attempt to use linguistic cues, such as the occurrence of certain parts of speech, to distinguish the language of fringe groups from strictly informative sources. The aim of this research is to provide a preliminary model for identifying deceptive materials online, such as aggressive marketing and hate speech. In this paper, we focus on the political aspect. Our research shows that sentence length and the occurrence of adjectives and adverbs can help identify differences between the language of fringe political groups and that of mainstream media.
Item type: Article, Access status: Open Access
Automated credibility assessment on Twitter
(Wydawnictwa AGH, 2015) Lorek, Krzysztof; Wiciński, Jacek; Jankowski-Lorek, Michał; Gupta, Amit
In this paper, we take a practical approach to automated credibility assessment on Twitter. We describe the process behind the design of an automated classifier for information credibility assessment. In addition, we propose a practical implementation of TwitterBOT, a tool that scores submitted tweets while working in the native Twitter interface.
Item type: Article, Access status: Open Access
Noisy-parallel and comparable corpora filtering methodology for the extraction of bi-lingual equivalent data at sentence level
(Wydawnictwa AGH, 2015) Wołk, Krzysztof
Text alignment and text quality are critical to the accuracy of Machine Translation (MT) systems, some NLP tools, and any other text processing task requiring bilingual data. This research proposes a language-independent bisentence filtering approach based on Polish-to-English experiments (Polish is not a position-sensitive language). This cleaning approach was developed on the TED Talks corpus and also initially tested on the Wikipedia comparable corpus, but it can be used for any text domain or language pair. The proposed approach implements various heuristics for sentence comparison. Some of the heuristics leverage synonyms as well as semantic and structural analysis of text as additional information. Minimization of data loss has been ensured. An improvement in MT system scores with text processed using this tool is discussed.
Item type: Article, Access status: Open Access
Document controversy classification based on the Wikipedia category structure
(Wydawnictwa AGH, 2015) Jankowski-Lorek, Michał; Zieliński, Kazimierz
Dispute and controversy are part of our culture and cannot be avoided on the Internet (where they become more anonymous). There have been many studies on controversy, especially on social networks such as Wikipedia. This free online encyclopedia has become a very popular data source among researchers studying behavior or natural language processing. This paper presents the use of the category structure of Wikipedia to determine the controversy of a single article. This is the first part of a proposed system for classifying the topic controversy score of any given text.
