e-ISSN:0976-5166
p-ISSN:2231-3850


INDIAN JOURNAL OF COMPUTER SCIENCE AND ENGINEERING


Indexed in

IJCSE Indexed in Scopus

ABSTRACT

Title : SHORT TEXT TOPIC MODELING WITH EMPIRICAL LEARNING
Authors : Supriya A. Kinariwala, Sachin N. Deshmukh
Keywords : Topic Modeling, Short Text, Latent Dirichlet Allocation, Non-negative Matrix Factorization, Semantics-assisted NMF.
Issue Date : Sep-Oct 2020
Abstract :
In the present digital era, the use of social media has been increasing rapidly, and people increasingly express their thoughts in short text. Social media websites such as Twitter and Facebook generate vast amounts of short text every second, which carries valuable real-time information, and extensive research is under way to extract knowledge from it. Because short text is sparse and ambiguous, finding its latent topics is a major challenge. Topic modeling, an unsupervised machine learning approach, addresses this problem. This paper presents a comparative analysis of three topic modeling methods: Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and Semantics-assisted Non-negative Matrix Factorization (SeaNMF). The three methods were tested on the ABCNews headline dataset, and the results were analyzed using the average Normalized Google Distance (NGD) score, which is 67.88%, 58.60%, and 59.32% for SeaNMF, NMF, and LDA, respectively. The quantitative results indicate that the SeaNMF model clusters more meaningful and semantically similar words under each topic.
Page(s) : 510-516
ISSN : 0976-5166
Source : Vol. 11, No. 5
DOI : 10.21817/indjcse/2020/v11i5/201105168
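
The abstract evaluates topics with an average Normalized Google Distance (NGD) score but does not spell out the computation. As a rough illustration only, the sketch below applies the standard NGD formula, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), to the top words of one topic. It rests on assumptions not stated in the abstract: occurrence counts come from a local corpus of tokenized headlines rather than web search hits, the topic score is a plain mean over all word pairs, and the function names `ngd` and `average_ngd` are hypothetical. How the paper converts its scores to percentages is not given here, so the sketch does not attempt it.

```python
import math
from itertools import combinations


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from raw counts.

    fx, fy: number of documents containing each term
    fxy:    number of documents containing both terms
    n:      total number of documents
    """
    if fx == 0 or fy == 0 or fxy == 0:
        # Assumption: terms that never co-occur are treated as infinitely distant.
        return math.inf
    log_fx, log_fy = math.log(fx), math.log(fy)
    denom = math.log(n) - min(log_fx, log_fy)
    if denom == 0:
        # The rarer term appears in every document, so both terms always co-occur.
        return 0.0
    return (max(log_fx, log_fy) - math.log(fxy)) / denom


def average_ngd(topic_words, docs):
    """Mean pairwise NGD over a topic's words (assumed aggregation).

    docs is a list of sets of tokens, one set per headline.
    """
    n = len(docs)
    pairs = list(combinations(topic_words, 2))
    if not pairs:
        raise ValueError("need at least two topic words")
    total = 0.0
    for x, y in pairs:
        fx = sum(1 for d in docs if x in d)
        fy = sum(1 for d in docs if y in d)
        fxy = sum(1 for d in docs if x in d and y in d)
        total += ngd(fx, fy, fxy, n)
    return total / len(pairs)


# Toy corpus standing in for tokenized headlines (illustrative data only).
docs = [
    {"rain", "flood"},
    {"rain", "flood"},
    {"rain", "storm"},
    {"election", "vote"},
]
score = average_ngd(["rain", "flood"], docs)
```

Under the standard definition a smaller NGD means the two words co-occur more consistently, so any comparison between models depends on how the raw distances are turned into a reported score.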