The impact of text preprocessing on the prediction of review ratings

Authors: MUHİTTİN IŞIK, HASAN DAĞ

Abstract: With the growth of e-commerce platforms and online applications, businesses are looking for rating and review systems through which they can easily gauge customers' feelings about their products and services. Statistics indicate that online ratings and reviews attract new customers and increase sales by providing confidence, endorsement, opinions, comparisons, merchant credibility, etc. Although considerable research has been devoted to sentiment analysis for review classification, rather less attention has been paid to text preprocessing, which is a crucial step in opinion mining, especially when suitable preprocessing strategies can increase classification accuracy. In this paper, we concentrate on the impact of simple text preprocessing decisions on the prediction of fine-grained review rating stars, whereas the majority of previous work has focused on the binary distinction of positive vs. negative. The aim of this research is therefore to analyze preprocessing techniques and their influence on the performance of a five-class review rating classifier, and to explain the interesting observations and results.

Keywords: Text preprocessing, sentiment analysis, opinion mining, review rating, text mining.
