MINING SOCIAL MEDIA CROWD TRENDS FROM THAI TEXT POSTS AND COMMENTS
Abstract
Text mining from social media streams has attracted wide interest from both businesses and academics. Very large numbers of self-posts from crowd sources contain hidden trends, which can be valuable to a business enterprise. Crowd trend mining methods, together with an easily understood visual presentation, are thus in great demand. We present an approach to mining crowd trends from Thai text posts. Because written Thai does not mark word boundaries, a Thai language preprocessing module was necessary to transform continuous text into a series of words. Our method could then mine general, unforeseen crowd trends by combining an automatic context extraction technique, a tf-idf score, and an aggregated opinion score calculated from automatically classified sentiments for each post or comment. The best sentiment classifier was chosen based on extensive experiments on the same data source. These scores were combined into one unified term popularity score, which was visualized as a word cloud in a web application. A case study used a popular Thai discussion website, Pantip.com, and achieved three interwoven goals: (1) extraction of general and unforeseen crowd trends from a Thai discussion website, (2) assignment of a unified popularity score to each candidate term, and (3) presentation of those terms to end users in an easily comprehended form.
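The scoring step described in the abstract can be illustrated with a minimal sketch. It assumes that posts have already been segmented into words and that each post carries a sentiment label of +1, 0, or -1 from some classifier. The function names, the smoothed idf, the weight `alpha`, and the linear combination are hypothetical choices made for illustration only; the abstract does not state how the scores were actually combined.

```python
import math
from collections import Counter, defaultdict


def tfidf_scores(docs):
    """Sum tf-idf over all documents for each term (docs: list of token lists)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = defaultdict(float)
    for doc in docs:
        if not doc:
            continue
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            # Assumption: idf is shifted by 1 so ubiquitous terms keep some weight.
            idf = math.log(n_docs / doc_freq[term]) + 1.0
            scores[term] += tf * idf
    return dict(scores)


def opinion_scores(docs, sentiments):
    """Mean sentiment label of the documents that mention each term."""
    totals = defaultdict(float)
    counts = Counter()
    for doc, label in zip(docs, sentiments):
        for term in set(doc):
            totals[term] += label
            counts[term] += 1
    return {term: totals[term] / counts[term] for term in counts}


def unified_popularity(docs, sentiments, alpha=0.5):
    """Hypothetical combination: weighted sum of normalized tf-idf and |opinion|."""
    tfidf = tfidf_scores(docs)
    opinion = opinion_scores(docs, sentiments)
    max_tfidf = max(tfidf.values(), default=1.0)
    return {
        term: alpha * tfidf[term] / max_tfidf + (1 - alpha) * abs(opinion[term])
        for term in tfidf
    }


if __name__ == "__main__":
    # Toy English tokens stand in for segmented Thai words.
    docs = [["phone", "battery", "good"], ["phone", "screen", "bad"], ["battery", "good"]]
    sentiments = [1, -1, 1]
    ranked = sorted(unified_popularity(docs, sentiments).items(), key=lambda kv: -kv[1])
    for term, score in ranked:
        print(f"{term}\t{score:.3f}")
```

The resulting per-term scores could then be used as font-size weights in a word cloud. Taking the absolute value of the opinion score treats strongly negative and strongly positive terms as equally prominent, which is one possible design choice among several.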