e-ISSN:0976-5166
p-ISSN:2231-3850


INDIAN JOURNAL OF COMPUTER SCIENCE AND ENGINEERING


ABSTRACT

Title : SHORT TEXT TOPIC MODELING WITH EMPIRICAL LEARNING
Authors : Supriya A. Kinariwala, Sachin N. Deshmukh
Keywords : Topic Modeling, Short Text, Latent Dirichlet Allocation, Non-negative Matrix Factorization, Semantics-assisted NMF.
Issue Date : Sep-Oct 2020
Abstract :
In the present digital era, the use of social media has been increasing exponentially, and people increasingly express their thoughts in short text. Social media websites such as Twitter and Facebook generate vast amounts of short text every second, carrying valuable real-time information. Extensive research is under way to discover knowledge from this text. Short text is sparse and ambiguous, so finding latent topics in it is a major challenge. This challenge can be addressed with an unsupervised machine learning approach known as topic modeling. This paper covers three topic modeling methods, Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and Semantics-assisted Non-negative Matrix Factorization (SeaNMF), and presents a comparative analysis of them. The three methods were tested on the ABCNews headline dataset, and the results were analyzed using the average Normalized Google Distance (NGD) score, which is 67.88%, 58.60%, and 59.32% for SeaNMF, NMF, and LDA respectively. The quantitative results show that the SeaNMF model clusters more meaningful and semantically similar words under each topic.
Page(s) : 510-516
ISSN : 0976-5166
Source : Vol. 11, No.5
DOI : 10.21817/indjcse/2020/v11i5/201105168
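
The abstract evaluates topics with the average Normalized Google Distance (NGD) but does not spell out the computation. The following is a minimal sketch of the standard NGD formula, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), averaged over all word pairs in one topic. It rests on assumptions that are not from the paper: document frequencies in a local corpus stand in for search-engine hit counts, each document is a set of tokens, and the names `ngd` and `average_ngd` are hypothetical. How the paper converts NGD into the percentage scores it reports is not stated in the abstract, so no such conversion is attempted here.

```python
import math
from itertools import combinations


def ngd(term_x, term_y, documents):
    """NGD between two terms, using document frequencies from a local
    corpus (an assumption; the original measure uses search hit counts).

    `documents` is a list of sets of tokens.
    """
    n = len(documents)
    fx = sum(1 for doc in documents if term_x in doc)
    fy = sum(1 for doc in documents if term_y in doc)
    fxy = sum(1 for doc in documents if term_x in doc and term_y in doc)

    # Terms that never occur, or never co-occur, are treated as
    # infinitely distant.
    if fx == 0 or fy == 0 or fxy == 0:
        return float("inf")

    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    denominator = math.log(n) - min(log_fx, log_fy)
    if denominator == 0:
        # Both terms occur in every document, so they are indistinguishable.
        return 0.0
    return (max(log_fx, log_fy) - log_fxy) / denominator


def average_ngd(topic_words, documents):
    """Mean NGD over all unordered pairs of words in one topic.

    The result is infinite if any pair never co-occurs.
    """
    pairs = list(combinations(topic_words, 2))
    if not pairs:
        return 0.0
    return sum(ngd(x, y, documents) for x, y in pairs) / len(pairs)
```

A lower average NGD means the top words of a topic tend to co-occur, which is the sense in which a topic is "semantically similar". In practice the topic words would come from the fitted LDA, NMF, or SeaNMF model, and the corpus could be the tokenized headlines themselves.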