
Sentiment classification of Internet restaurant reviews written in Cantonese

  • Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Cantonese is an important dialect in some regions of Southern China. Local online users often express their opinions and experiences on the web in written Cantonese. Although the information in those reviews is valuable to potential consumers and sellers, the huge volume of web reviews makes it difficult to form an unbiased evaluation of a product, and Cantonese reviews are unintelligible to Mandarin Chinese speakers. In this paper, two standard machine learning techniques, naive Bayes and SVM, are applied to online restaurant reviews written in Cantonese to automatically classify user reviews as positive or negative. The effects of feature representations and feature sizes on classification performance are discussed. We find that accuracy is influenced by the interaction between the classification models and the feature options. The naive Bayes classifier achieves accuracy as good as or better than that of SVM. Character-based bigrams prove to be better features than unigrams and trigrams in capturing Cantonese sentiment orientation.
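
To make the feature choice in the abstract concrete, the following is a minimal sketch of character-bigram extraction feeding a multinomial naive Bayes classifier with add-one smoothing. It is not the authors' implementation: the names `char_ngrams` and `NaiveBayes`, the smoothing value, and the four toy training reviews are all assumptions made up for illustration, and the paper's actual data, preprocessing, and feature selection are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


class NaiveBayes:
    """Multinomial naive Bayes over character n-gram counts (illustrative only)."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n          # n-gram length: 1 = unigram, 2 = bigram, 3 = trigram
        self.alpha = alpha  # additive smoothing constant (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            # Log prior plus smoothed log likelihood of each known n-gram.
            score = math.log(doc_count / total_docs)
            for gram in char_ngrams(text, self.n):
                if gram in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((counts[gram] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy reviews, invented for this sketch (not from the paper's corpus).
train_texts = ["好好食", "服務好好", "好難食", "服務好差"]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes(n=2).fit(train_texts, train_labels)
print(model.predict("嘢好好食"))
print(model.predict("好差"))
```

Changing `n` to 1 or 3 switches the same sketch to unigram or trigram features, which is the kind of comparison the abstract reports. Character n-grams also avoid the need for a word segmenter, which is one plausible reason they suit written Cantonese.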

Original language: English
Pages (from-to): 7674-7682
Number of pages: 9
Journal: Expert Systems with Applications
Volume: 38
Issue number: 6
State: Published - Jun 2011

Keywords

  • Cantonese
  • Machine learning
  • Online review
  • Restaurant
  • Sentiment classification
