Author:
Avi Bleiweiss
Affiliation:
BShalem Research, United States
Keyword(s):
Word Vectors, Deep Learning, Semantic Matching, Multidimensional Scaling, Clustering.
Related Ontology Subjects/Areas/Topics:
Applications; Artificial Intelligence; Bioinformatics; Biomedical Engineering; Biomedical Signal Processing; Computational Intelligence; Health Engineering and Technology Applications; Human-Computer Interaction; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Methodologies and Methods; Natural Language Processing; Neural Networks; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems; Theory and Methods; Visualization
Abstract:
Semantic word embeddings have been shown to cluster in space based on linguistic similarities that are quantifiably
captured using simple vector arithmetic. Recently, methods for learning distributed word vectors have progressively
empowered neural language models to compute compositional vector representations for phrases
of variable length. However, they remain limited in expressing more generic relatedness between instances of
larger, non-uniformly sized bodies of text. In this work, we propose a formulation that combines a word
vector set of variable cardinality to represent a verse or a sentence, with an iterative distance metric to evaluate
similarity in pairs of non-conforming verse matrices. In contrast to baselines characterized by a bag of
features, our model preserves word order and is more sustainable in performing semantic matching at any
of the verse, chapter, and book levels. Using our framework to train word vectors, we analyzed the clustering
of Bible books, exploring multidimensional scaling for visualization, and experimented with book searches
of both contiguous and out-of-order parts of verses. We report robust results that support our intuition for
measuring book-to-book and verse-to-book similarity.