A Framework for Massive Twitter Data Extraction and Analysis
Abstract
Social networks have emerged as tools for communication and socialization. The vast amount of data these networks generate has led to a growing need for automatic knowledge extraction. The popularity of these services makes them well suited to trend discovery. In particular, Twitter offers an open environment where people all around the world share information and opinions, making it a real-time repository of knowledge that researchers and applications can exploit. We propose an open framework to automatically collect and analyze data from Twitter’s public streams. The framework is customizable and extensible, so researchers can use it to test new techniques. It is complemented by a language-agnostic sentiment analysis module, which provides a set of tools for sentiment analysis of the collected tweets. The capabilities of this platform are illustrated with two case studies in Spanish: one related to a high-impact event (the Boston Terror Attack) and another related to regular political activity on Twitter.