Authors:
Evripides Christodoulou¹; Andreas Gregoriades¹; Maria Pampaka² and Herodotos Herodotou¹
Affiliations:
¹ Cyprus University of Technology, Limassol, Cyprus; ² The University of Manchester, Manchester, U.K.
Keyword(s):
XGBoost, Topic Analysis, Word2Vec, Revisit Intention, Data Mining, Tourists’ Reviews.
Abstract:
Revisit intention is a key indicator of future business performance in the hospitality industry. This work identifies patterns in user-generated data that explain why tourists may revisit a hotel they stayed at during their holidays, and examines differences between two classes of hotels (4-5 star and 2-3 star). The method uses data retrieved from TripAdvisor with a scraper application. Topic modelling is first performed to identify the main themes discussed in each tourist review. Reviews are then labelled according to whether their author mentions an intention to revisit the hotel, using an ontology of revisit intention generated with Word2Vec word embeddings. The topics identified in the labelled reviews are used to train an Extreme Gradient Boosting (XGBoost) model to predict revisit intention. The learned model achieved satisfactory performance and was used to identify the most influential topics related to revisit intention, with an explainable machine learning technique visually illustrating the rules embedded in the model. The method is applied to reviews from tourists who visited Cyprus between 2009 and 2019. Results highlight that staff professionalism (e.g., politeness, smile) is critical for both classes of hotels; however, its effect is smaller for 2-3 star hotels, where cleanliness has a greater influence on revisiting.
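The labelling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the term list stands in for the Word2Vec-derived revisit-intention ontology, which is not given here, and the negation check, the `window` parameter, and the function name `label_review` are assumptions introduced only for illustration.

```python
import re

# Hypothetical seed phrases. In the paper the revisit-intention vocabulary
# comes from a Word2Vec-derived ontology, which is not reproduced here.
REVISIT_TERMS = ("come back", "return", "revisit", "stay again", "visit again")

# Assumed negation cues; the abstract does not say how negation is handled.
NEGATIONS = ("not", "never", "won't", "wouldn't")


def label_review(text: str, window: int = 3) -> int:
    """Return 1 if the review mentions an intention to revisit, else 0.

    A term counts only if none of the `window` tokens before it is a
    negation cue (an illustrative heuristic, not the paper's rule).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    for term in REVISIT_TERMS:
        term_tokens = term.split()
        n = len(term_tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == term_tokens:
                preceding = tokens[max(0, i - window):i]
                if not any(tok in NEGATIONS for tok in preceding):
                    return 1
    return 0


# Made-up example reviews, for illustration only.
reviews = [
    "We will definitely come back next summer",
    "The room was clean and the staff were polite",
    "We would never stay again",
]
labels = [label_review(r) for r in reviews]
```

Labels produced this way would serve as the binary target for a classifier trained on per-review topic features, which is the role XGBoost plays in the described method.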