Article

Impact of n-stage latent Dirichlet allocation on analysis of headline classification

creativeworkseries.issn: 1508-2806
dc.contributor.author: Güven, Zekeriya Anil
dc.contributor.author: Diri, Banu
dc.contributor.author: Çakaloğlu, Tolgahan
dc.date.available: 2025-06-20T06:01:31Z
dc.date.issued: 2022
dc.description: Bibliography: pp. 392-394.
dc.description.abstract: Data analysis becomes more difficult as the amount of data increases. More specifically, extracting meaningful insights from this vast amount of data and grouping it by shared features without human intervention requires advanced methodologies. Topic-modeling methods help overcome this problem in text analysis for downstream tasks (such as sentiment analysis, spam detection, and news classification). In this research, we benchmark several classifiers (namely, random forest, AdaBoost, naive Bayes, and logistic regression) using the classical latent Dirichlet allocation (LDA) and n-stage LDA topic-modeling methods for feature extraction in headline classification. We ran our experiments on three and five classes of publicly available Turkish and English datasets. We have demonstrated that, as a feature extractor, n-stage LDA obtains state-of-the-art performance for any downstream classifier. It should also be noted that random forest was the most successful algorithm for both datasets. (en)
dc.description.placeOfPublication: Kraków
dc.description.version: publisher's version
dc.identifier.doi: https://doi.org/10.7494/csci.2022.23.3.4622
dc.identifier.eissn: 2300-7036
dc.identifier.issn: 1508-2806
dc.identifier.uri: https://repo.agh.edu.pl/handle/AGH/113311
dc.language.iso: eng
dc.publisher: Wydawnictwa AGH
dc.relation.ispartof: Computer Science
dc.rights: Attribution 4.0 International
dc.rights.access: open access
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/legalcode
dc.subject: topic modeling (en)
dc.subject: headline classification (en)
dc.subject: machine learning (en)
dc.subject: text classification (en)
dc.subject: latent Dirichlet allocation (en)
dc.subject: data analysis (en)
dc.title: Impact of n-stage latent Dirichlet allocation on analysis of headline classification (en)
dc.title.related: Computer Science (en)
dc.type: article
dspace.entity.type: Publication
publicationissue.issueNumber: No. 3
publicationissue.pagination: pp. 375-394
publicationvolume.volumeNumber: Vol. 23
relation.isJournalIssueOfPublication: 19f2aab8-50a9-4121-8881-5e38e346b24f
relation.isJournalIssueOfPublication.latestForDiscovery: 19f2aab8-50a9-4121-8881-5e38e346b24f
relation.isJournalOfPublication: 020291ee-249b-4dcf-98a3-276a2f7981aa

Files

Original bundle

Name: csci.2022.23.3.375.pdf
Size: 1.07 MB
Format: Adobe Portable Document Format