Authors:
Muskan Garg ¹; Mukesh Kumar ² and Debabrata Samanta ³
Affiliations:
¹ Artificial Intelligence & Informatics, Mayo Clinic, Rochester, MN, U.S.A.
² University Institute of Engineering & Technology, Panjab University, Chandigarh, India
³ Department of Computational Information Technology, Rochester Institute of Technology, Kosovo
Keyword(s):
Co-Word Analysis, Graph Theory, Information Retrieval, Linguistic Analysis, Pattern Recognition.
Abstract:
Text-based information retrieval tasks such as topic detection and tracking have shifted from static to dynamic settings over the last decade. We posit the need to investigate an interdisciplinary approach that combines network science and natural language processing for graph-based information extraction. In the post-lockdown era, with internet traffic increasing, it makes sense to consider a Graph of Words (GoW) built from user-generated text on social media platforms. The idea is to uncover latent patterns in graph-based text representations under limited resource availability, yielding effective models as an alternative to computationally expensive pre-trained models that are limited to a certain type of information extraction. To advance statistical approaches for language-independent models, we outline three information retrieval applications: (i) Structural analysis: find unique patterns in domain-, language-, or genre-specific GoW for keyword extraction; (ii) Language independence: design an objective function for language-independent information retrieval; (iii) Dynamism: mathematical modeling of concept drift and evolving trends or events in a dynamic GoW built from streaming data. We associate recent developments and open challenges with our position as potential research directions.
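As an illustration of the Graph of Words idea, the following is a minimal sketch, not the authors' method. It assumes whitespace tokenization, a sliding co-occurrence window, and weighted degree as a stand-in keyword score; the function names `build_gow` and `rank_by_weighted_degree` and the `window` parameter are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def build_gow(tokens, window=2):
    """Build an undirected, weighted co-occurrence graph.

    Each token is linked to the tokens that follow it within `window`
    positions (window=2 links only adjacent tokens). The edge weight
    counts how often the pair co-occurs.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            graph[word][other] += 1
            graph[other][word] += 1
    return graph


def rank_by_weighted_degree(graph):
    """Rank nodes by the sum of their edge weights (assumed keyword score)."""
    scores = {w: sum(nbrs.values()) for w, nbrs in graph.items()}
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    text = "graph of words graph of text"
    gow = build_gow(text.split(), window=2)
    for word, score in rank_by_weighted_degree(gow):
        print(word, score)
```

Because the sketch relies only on token positions and counts, it needs no language-specific resources beyond a tokenizer, which is the property the language-independence application targets. For streaming data, the same graph could be updated incrementally as new tokens arrive, though how to model concept drift on top of it is left open here.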