Automated citation sentiment analysis using high order n-grams: a 
preliminary investigation

MUHAMMAD TOUSEEF IKRAM; MUHAMMAD TANVIR AFZAL; NAVEED ANWER BUTT

Abstract: Scientific papers are associated with previous research contributions (i.e. books, journal or conference papers, and web resources) through citations. A citation is regarded as a link between the citing work and the cited work. The nature of the cited material can be supportive (positive), contrastive (negative), or objective (neutral). Extraction of the author's sentiment towards cited scientific articles is an emerging research discipline, owing to the linguistic differences between citation sentences and other domains of sentiment analysis. In this paper, we propose a technique for identifying the sentiment of the citing author towards the cited paper by extracting unigram, bigram, trigram, and pentagram adjective and adverb patterns from the citation text. After POS tagging of the citation text, we use a sentence parser to extract linguistic features comprising adjectives, adverbs, and n-grams. A sentiment score is then assigned to classify the citation sentences as positive, negative, or neutral. In addition, the proposed technique is compared with manually classified citation text and two commercial tools, namely SEMANTRIA and THEYSAY, to determine their applicability to the citation corpus. These tools are based on different techniques for determining the sentiment orientation of a sentence. Analysis of the results shows that the proposed approach achieves results comparable to its commercial counterparts, with average precision, recall, and accuracy of 90%, 81.82%, and 85.91%, respectively.
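
The pipeline outlined in the abstract (POS tagging, extraction of adjective and adverb n-grams, score assignment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the pattern lexicon `LEXICON`, its scores, the zero threshold, and the helper names are all hypothetical, and the input is assumed to be already tagged with Penn Treebank style tags (`JJ*` for adjectives, `RB*` for adverbs).

```python
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (word, POS tag)

# n-gram orders named in the abstract: unigram, bigram, trigram, pentagram.
ORDERS = (1, 2, 3, 5)

# Hypothetical toy lexicon mapping word patterns to scores (an assumption;
# the abstract does not say which scoring resource is used).
LEXICON = {
    ("novel",): 1.0,
    ("remarkably", "well"): 1.0,
    ("poor",): -1.0,
    ("fails", "badly"): -1.0,
}


def ngrams(tokens: List[TaggedToken], n: int) -> List[Tuple[TaggedToken, ...]]:
    """Return all contiguous n-grams of the tagged token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def adj_adv_ngrams(tokens: List[TaggedToken], n: int) -> List[Tuple[TaggedToken, ...]]:
    """Keep only n-grams containing at least one adjective or adverb."""
    return [
        gram for gram in ngrams(tokens, n)
        if any(tag.startswith(("JJ", "RB")) for _, tag in gram)
    ]


def sentiment_score(tokens: List[TaggedToken]) -> float:
    """Sum lexicon scores over all matching adjective/adverb n-grams."""
    total = 0.0
    for n in ORDERS:
        for gram in adj_adv_ngrams(tokens, n):
            words = tuple(word.lower() for word, _ in gram)
            total += LEXICON.get(words, 0.0)
    return total


def classify(tokens: List[TaggedToken]) -> str:
    """Map the score to a label; the zero threshold is an assumption."""
    score = sentiment_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


citation = [("This", "DT"), ("novel", "JJ"), ("method", "NN"),
            ("works", "VBZ"), ("remarkably", "RB"), ("well", "RB")]
label = classify(citation)
```

In this sketch a citation sentence with no adjective or adverb pattern in the lexicon scores zero and is labelled neutral. A real system would take its tags from a POS tagger and its scores from a full sentiment resource.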

Keywords: Citation sentiment analysis, n-gram analysis, citation classification, SEMANTRIA, THEYSAY
